## Summary This is the third and last PR in this stack that adds support for toggling lints at a per-rule level. This PR introduces a new `LintRegistry`, a central index of known lints. The registry is required because we want to support lint rules from many different crates but need a way to look them up by name, e.g., when resolving a lint from a name in the configuration or analyzing a suppression comment. Adding a lint now requires two steps: 1. Declare the lint with `declare_lint` 2. Register the lint in the registry inside the `register_lints` function. I considered some more involved macros to avoid changes in two places. Still, I ultimately decided against it because a) it's just two places and b) I'd expect that registering a type checker lint will differ from registering a lint that runs as a rule in the linter. I worry that any more opinionated design could limit our options when working on the linter, so I kept it simple. The second part of this PR is the `RuleSelection`. It stores which lints are enabled and what severity they should use for created diagnostics. For now, the `RuleSelection` always gets initialized with all known lints and it uses their default level. ## Linter crates Each crate that defines lints should export a `register_lints` function that accepts a `&mut LintRegistryBuilder` to register all its known lints in the registry. This should make registering all known lints in a top-level crate easy: Just call `register_lints` of every crate that defines lint rules. I considered defining a `LintCollection` trait and even some fancy macros to accomplish the same but decided to go for this very simplistic approach for now. We can add more abstraction once needed. ## Lint rules This is a bit hand-wavy. I don't have a good sense for how our linter infrastructure will look like, but I expect we'll need a way to register the rules that should run as part of the red knot linter. One way is to keep doing what Ruff does by having one massive `checker` and each lint rule adds a call to itself in the relevant AST visitor methods. An alternative is that we have a `LintRule` trait that provides common hooks and implementations will be called at the "right time". Such a design would need a way to register all known lint implementations, possibly with the lint. This is where we'd probably want a dedicated `register_rule` method. A third option is that lint rules are handled separately from the `LintRegistry` and are specific to the linter crate. The current design should be flexible enough to support the three options. ## Documentation generation The documentation for all known lints can be generated by creating a factory, registering all lints by calling the `register_lints` methods, and then querying the registry for the metadata. ## Deserialization and Schema generation I haven't fully decided what the best approach is when it comes to deserializing lint rule names: * Reject invalid names in the deserializer. This gives us error messages with line and column numbers (by serde) * Don't validate lint rule names during deserialization; defer the validation until the configuration is resolved. This gives us more control over handling the error, e.g. emit a warning diagnostic instead of aborting when a rule isn't known. One technical challenge for both deserialization and schema generation is that the `Deserialize` and `JSONSchema` traits do not allow passing the `LintRegistry`, which is required to look up the lints by name. I suggest that we either rely on the salsa db being set for the current thread (`salsa::Attach`) or build our own thread-local storage for the `LintRegistry`. It's the caller's responsibility to make the lint registry available before calling `Deserialize` or `JSONSchema`. ## CLI support I prefer deferring adding support for enabling and disabling lints from the CLI for now because I think it will be easier to add once I've figured out how to handle configurations. ## Bitset optimization Ruff tracks the enabled rules using a cheap copyable `Bitset` instead of a hash map. This helped improve performance by a few percent (see https://github.com/astral-sh/ruff/pull/3606). However, this approach is no longer possible because lints have no "cheap" way to compute their index inside the registry (other than using a hash map). We could consider doing something similar to Salsa where each `LintMetadata` stores a `LazyLintIndex`. ``` pub struct LazyLintIndex { cached: OnceLock<(Nonce, LintIndex)> } impl LazyLintIndex { pub fn get(registry: &LintRegistry, lint: &'static LintMetadata) { let (nonce, index) = self.cached.get_or_init(|| registry.lint_index(lint)); if registry.nonce() == nonce { index } else { registry.lint_index(lint) } } ``` Each registry keeps a map from `LintId` to `LintIndex` where `LintIndex` is in the range of `0...registry.len()`. The `LazyLintIndex` is based on the assumption that every program has exactly **one** registry. This assumption allows to cache the `LintIndex` directly on the `LintMetadata`. The implementation falls back to the "slow" path if there is more than one registry at runtime. I was very close to implementing this optimization because it's kind of fun to implement. I ultimately decided against it because it adds complexity and I don't think it's worth doing in Red Knot today: * Red Knot only queries the rule selection when deciding whether or not to emit a diagnostic. It is rarely used to detect if a certain code block should run. This is different from Ruff where the rule selection is queried many times for every single AST node to determine which rules *should* run. * I'm not sure if a 2-3% performance improvement is worth the complexity I suggest revisiting this decision when working on the linter where a fast path for deciding if a rule is enabled might be more important (but that depends on how lint rules are implemented) ## Test Plan I removed a lint from the default rule registry, and the MD tests started failing because the diagnostics were no longer emitted.
466 lines
13 KiB
Rust
466 lines
13 KiB
Rust
use itertools::Itertools;
|
|
use ruff_db::diagnostic::{LintName, Severity};
|
|
use rustc_hash::FxHashMap;
|
|
use std::hash::Hasher;
|
|
use thiserror::Error;
|
|
|
|
#[derive(Debug, Clone)]
|
|
pub struct LintMetadata {
|
|
/// The unique identifier for the lint.
|
|
pub name: LintName,
|
|
|
|
/// A one-sentence summary of what the lint catches.
|
|
pub summary: &'static str,
|
|
|
|
/// An in depth explanation of the lint in markdown. Covers what the lint does, why it's bad and possible fixes.
|
|
///
|
|
/// The documentation may require post-processing to be rendered correctly. For example, lines
|
|
/// might have leading or trailing whitespace that should be removed.
|
|
pub raw_documentation: &'static str,
|
|
|
|
/// The default level of the lint if the user doesn't specify one.
|
|
pub default_level: Level,
|
|
|
|
pub status: LintStatus,
|
|
|
|
/// The source file in which the lint is declared.
|
|
pub file: &'static str,
|
|
|
|
/// The 1-based line number in the source `file` where the lint is declared.
|
|
pub line: u32,
|
|
}
|
|
|
|
#[derive(Copy, Clone, Debug, Eq, PartialEq, Ord, PartialOrd)]
|
|
pub enum Level {
|
|
/// The lint is disabled and should not run.
|
|
Ignore,
|
|
|
|
/// The lint is enabled and diagnostic should have a warning severity.
|
|
Warn,
|
|
|
|
/// The lint is enabled and diagnostics have an error severity.
|
|
Error,
|
|
}
|
|
|
|
impl Level {
|
|
pub const fn is_error(self) -> bool {
|
|
matches!(self, Level::Error)
|
|
}
|
|
|
|
pub const fn is_warn(self) -> bool {
|
|
matches!(self, Level::Warn)
|
|
}
|
|
|
|
pub const fn is_ignore(self) -> bool {
|
|
matches!(self, Level::Ignore)
|
|
}
|
|
}
|
|
|
|
impl TryFrom<Level> for Severity {
|
|
type Error = ();
|
|
|
|
fn try_from(level: Level) -> Result<Self, ()> {
|
|
match level {
|
|
Level::Ignore => Err(()),
|
|
Level::Warn => Ok(Severity::Warning),
|
|
Level::Error => Ok(Severity::Error),
|
|
}
|
|
}
|
|
}
|
|
|
|
impl LintMetadata {
|
|
pub fn name(&self) -> LintName {
|
|
self.name
|
|
}
|
|
|
|
pub fn summary(&self) -> &str {
|
|
self.summary
|
|
}
|
|
|
|
/// Returns the documentation line by line with one leading space and all trailing whitespace removed.
|
|
pub fn documentation_lines(&self) -> impl Iterator<Item = &str> {
|
|
self.raw_documentation
|
|
.lines()
|
|
.map(|line| line.strip_prefix(' ').unwrap_or(line).trim_end())
|
|
}
|
|
|
|
/// Returns the documentation as a single string.
|
|
pub fn documentation(&self) -> String {
|
|
self.documentation_lines().join("\n")
|
|
}
|
|
|
|
pub fn default_level(&self) -> Level {
|
|
self.default_level
|
|
}
|
|
|
|
pub fn status(&self) -> &LintStatus {
|
|
&self.status
|
|
}
|
|
|
|
pub fn file(&self) -> &str {
|
|
self.file
|
|
}
|
|
|
|
pub fn line(&self) -> u32 {
|
|
self.line
|
|
}
|
|
}
|
|
|
|
#[doc(hidden)]
|
|
pub const fn lint_metadata_defaults() -> LintMetadata {
|
|
LintMetadata {
|
|
name: LintName::of(""),
|
|
summary: "",
|
|
raw_documentation: "",
|
|
default_level: Level::Error,
|
|
status: LintStatus::preview("0.0.0"),
|
|
file: "",
|
|
line: 1,
|
|
}
|
|
}
|
|
|
|
#[derive(Copy, Clone, Debug)]
|
|
pub enum LintStatus {
|
|
/// The lint has been added to the linter, but is not yet stable.
|
|
Preview {
|
|
/// The version in which the lint was added.
|
|
since: &'static str,
|
|
},
|
|
|
|
/// The lint is stable.
|
|
Stable {
|
|
/// The version in which the lint was stabilized.
|
|
since: &'static str,
|
|
},
|
|
|
|
/// The lint is deprecated and no longer recommended for use.
|
|
Deprecated {
|
|
/// The version in which the lint was deprecated.
|
|
since: &'static str,
|
|
|
|
/// The reason why the lint has been deprecated.
|
|
///
|
|
/// This should explain why the lint has been deprecated and if there's a replacement lint that users
|
|
/// can use instead.
|
|
reason: &'static str,
|
|
},
|
|
|
|
/// The lint has been removed and can no longer be used.
|
|
Removed {
|
|
/// The version in which the lint was removed.
|
|
since: &'static str,
|
|
|
|
/// The reason why the lint has been removed.
|
|
reason: &'static str,
|
|
},
|
|
}
|
|
|
|
impl LintStatus {
|
|
pub const fn preview(since: &'static str) -> Self {
|
|
LintStatus::Preview { since }
|
|
}
|
|
|
|
pub const fn stable(since: &'static str) -> Self {
|
|
LintStatus::Stable { since }
|
|
}
|
|
|
|
pub const fn deprecated(since: &'static str, reason: &'static str) -> Self {
|
|
LintStatus::Deprecated { since, reason }
|
|
}
|
|
|
|
pub const fn removed(since: &'static str, reason: &'static str) -> Self {
|
|
LintStatus::Removed { since, reason }
|
|
}
|
|
|
|
pub const fn is_removed(&self) -> bool {
|
|
matches!(self, LintStatus::Removed { .. })
|
|
}
|
|
}
|
|
|
|
/// Declares a lint rule with the given metadata.
|
|
///
|
|
/// ```rust
|
|
/// use red_knot_python_semantic::declare_lint;
|
|
/// use red_knot_python_semantic::lint::{LintStatus, Level};
|
|
///
|
|
/// declare_lint! {
|
|
/// /// ## What it does
|
|
/// /// Checks for references to names that are not defined.
|
|
/// ///
|
|
/// /// ## Why is this bad?
|
|
/// /// Using an undefined variable will raise a `NameError` at runtime.
|
|
/// ///
|
|
/// /// ## Example
|
|
/// ///
|
|
/// /// ```python
|
|
/// /// print(x) # NameError: name 'x' is not defined
|
|
/// /// ```
|
|
/// pub(crate) static UNRESOLVED_REFERENCE = {
|
|
/// summary: "detects references to names that are not defined",
|
|
/// status: LintStatus::preview("1.0.0"),
|
|
/// default_level: Level::Warn,
|
|
/// }
|
|
/// }
|
|
/// ```
|
|
#[macro_export]
|
|
macro_rules! declare_lint {
|
|
(
|
|
$(#[doc = $doc:literal])+
|
|
$vis: vis static $name: ident = {
|
|
summary: $summary: literal,
|
|
status: $status: expr,
|
|
// Optional properties
|
|
$( $key:ident: $value:expr, )*
|
|
}
|
|
) => {
|
|
$( #[doc = $doc] )+
|
|
#[allow(clippy::needless_update)]
|
|
$vis static $name: $crate::lint::LintMetadata = $crate::lint::LintMetadata {
|
|
name: ruff_db::diagnostic::LintName::of(ruff_macros::kebab_case!($name)),
|
|
summary: $summary,
|
|
raw_documentation: concat!($($doc,)+ "\n"),
|
|
status: $status,
|
|
file: file!(),
|
|
line: line!(),
|
|
$( $key: $value, )*
|
|
..$crate::lint::lint_metadata_defaults()
|
|
};
|
|
};
|
|
}
|
|
|
|
/// A unique identifier for a lint rule.
|
|
///
|
|
/// Implements `PartialEq`, `Eq`, and `Hash` based on the `LintMetadata` pointer
|
|
/// for fast comparison and lookup.
|
|
#[derive(Debug, Clone, Copy)]
|
|
pub struct LintId {
|
|
definition: &'static LintMetadata,
|
|
}
|
|
|
|
impl LintId {
|
|
pub const fn of(definition: &'static LintMetadata) -> Self {
|
|
LintId { definition }
|
|
}
|
|
}
|
|
|
|
impl PartialEq for LintId {
|
|
fn eq(&self, other: &Self) -> bool {
|
|
std::ptr::eq(self.definition, other.definition)
|
|
}
|
|
}
|
|
|
|
impl Eq for LintId {}
|
|
|
|
impl std::hash::Hash for LintId {
|
|
fn hash<H: Hasher>(&self, state: &mut H) {
|
|
std::ptr::hash(self.definition, state);
|
|
}
|
|
}
|
|
|
|
impl std::ops::Deref for LintId {
|
|
type Target = LintMetadata;
|
|
|
|
fn deref(&self) -> &Self::Target {
|
|
self.definition
|
|
}
|
|
}
|
|
|
|
#[derive(Default, Debug)]
|
|
pub struct LintRegistryBuilder {
|
|
/// Registered lints that haven't been removed.
|
|
lints: Vec<LintId>,
|
|
|
|
/// Lints indexed by name, including aliases and removed rules.
|
|
by_name: FxHashMap<&'static str, LintEntry>,
|
|
}
|
|
|
|
impl LintRegistryBuilder {
|
|
#[track_caller]
|
|
pub fn register_lint(&mut self, lint: &'static LintMetadata) {
|
|
assert_eq!(
|
|
self.by_name.insert(&*lint.name, lint.into()),
|
|
None,
|
|
"duplicate lint registration for '{name}'",
|
|
name = lint.name
|
|
);
|
|
|
|
if !lint.status.is_removed() {
|
|
self.lints.push(LintId::of(lint));
|
|
}
|
|
}
|
|
|
|
#[track_caller]
|
|
pub fn register_alias(&mut self, from: LintName, to: &'static LintMetadata) {
|
|
let target = match self.by_name.get(to.name.as_str()) {
|
|
Some(LintEntry::Lint(target) | LintEntry::Removed(target)) => target,
|
|
Some(LintEntry::Alias(target)) => {
|
|
panic!(
|
|
"lint alias {from} -> {to:?} points to another alias {target:?}",
|
|
target = target.name()
|
|
)
|
|
}
|
|
None => panic!(
|
|
"lint alias {from} -> {to} points to non-registered lint",
|
|
to = to.name
|
|
),
|
|
};
|
|
|
|
assert_eq!(
|
|
self.by_name
|
|
.insert(from.as_str(), LintEntry::Alias(*target)),
|
|
None,
|
|
"duplicate lint registration for '{from}'",
|
|
);
|
|
}
|
|
|
|
pub fn build(self) -> LintRegistry {
|
|
LintRegistry {
|
|
lints: self.lints,
|
|
by_name: self.by_name,
|
|
}
|
|
}
|
|
}
|
|
|
|
#[derive(Default, Debug)]
|
|
pub struct LintRegistry {
|
|
lints: Vec<LintId>,
|
|
by_name: FxHashMap<&'static str, LintEntry>,
|
|
}
|
|
|
|
impl LintRegistry {
|
|
/// Looks up a lint by its name.
|
|
pub fn get(&self, code: &str) -> Result<LintId, GetLintError> {
|
|
match self.by_name.get(code) {
|
|
Some(LintEntry::Lint(metadata)) => Ok(*metadata),
|
|
Some(LintEntry::Alias(lint)) => {
|
|
if lint.status.is_removed() {
|
|
Err(GetLintError::Removed(lint.name()))
|
|
} else {
|
|
Ok(*lint)
|
|
}
|
|
}
|
|
Some(LintEntry::Removed(lint)) => Err(GetLintError::Removed(lint.name())),
|
|
None => Err(GetLintError::Unknown(code.to_string())),
|
|
}
|
|
}
|
|
|
|
/// Returns all registered, non-removed lints.
|
|
pub fn lints(&self) -> &[LintId] {
|
|
&self.lints
|
|
}
|
|
|
|
/// Returns an iterator over all known aliases and to their target lints.
|
|
///
|
|
/// This iterator includes aliases that point to removed lints.
|
|
pub fn aliases(&self) -> impl Iterator<Item = (LintName, LintId)> + '_ {
|
|
self.by_name.iter().filter_map(|(key, value)| {
|
|
if let LintEntry::Alias(alias) = value {
|
|
Some((LintName::of(key), *alias))
|
|
} else {
|
|
None
|
|
}
|
|
})
|
|
}
|
|
|
|
/// Iterates over all removed lints.
|
|
pub fn removed(&self) -> impl Iterator<Item = LintId> + '_ {
|
|
self.by_name.iter().filter_map(|(_, value)| {
|
|
if let LintEntry::Removed(metadata) = value {
|
|
Some(*metadata)
|
|
} else {
|
|
None
|
|
}
|
|
})
|
|
}
|
|
}
|
|
|
|
#[derive(Error, Debug, Clone)]
|
|
pub enum GetLintError {
|
|
/// The name maps to this removed lint.
|
|
#[error("lint {0} has been removed")]
|
|
Removed(LintName),
|
|
|
|
/// No lint with the given name is known.
|
|
#[error("unknown lint {0}")]
|
|
Unknown(String),
|
|
}
|
|
|
|
#[derive(Debug, PartialEq, Eq)]
|
|
pub enum LintEntry {
|
|
/// An existing lint rule. Can be in preview, stable or deprecated.
|
|
Lint(LintId),
|
|
/// A lint rule that has been removed.
|
|
Removed(LintId),
|
|
Alias(LintId),
|
|
}
|
|
|
|
impl From<&'static LintMetadata> for LintEntry {
|
|
fn from(metadata: &'static LintMetadata) -> Self {
|
|
if metadata.status.is_removed() {
|
|
LintEntry::Removed(LintId::of(metadata))
|
|
} else {
|
|
LintEntry::Lint(LintId::of(metadata))
|
|
}
|
|
}
|
|
}
|
|
|
|
#[derive(Debug, Clone, Default)]
|
|
pub struct RuleSelection {
|
|
/// Map with the severity for each enabled lint rule.
|
|
///
|
|
/// If a rule isn't present in this map, then it should be considered disabled.
|
|
lints: FxHashMap<LintId, Severity>,
|
|
}
|
|
|
|
impl RuleSelection {
|
|
/// Creates a new rule selection from all known lints in the registry that are enabled
|
|
/// according to their default severity.
|
|
pub fn from_registry(registry: &LintRegistry) -> Self {
|
|
let lints = registry
|
|
.lints()
|
|
.iter()
|
|
.filter_map(|lint| {
|
|
Severity::try_from(lint.default_level())
|
|
.ok()
|
|
.map(|severity| (*lint, severity))
|
|
})
|
|
.collect();
|
|
|
|
RuleSelection { lints }
|
|
}
|
|
|
|
/// Returns an iterator over all enabled lints.
|
|
pub fn enabled(&self) -> impl Iterator<Item = LintId> + '_ {
|
|
self.lints.keys().copied()
|
|
}
|
|
|
|
/// Returns an iterator over all enabled lints and their severity.
|
|
pub fn iter(&self) -> impl ExactSizeIterator<Item = (LintId, Severity)> + '_ {
|
|
self.lints.iter().map(|(&lint, &severity)| (lint, severity))
|
|
}
|
|
|
|
/// Returns the configured severity for the lint with the given id or `None` if the lint is disabled.
|
|
pub fn severity(&self, lint: LintId) -> Option<Severity> {
|
|
self.lints.get(&lint).copied()
|
|
}
|
|
|
|
/// Enables `lint` and configures with the given `severity`.
|
|
///
|
|
/// Overrides any previous configuration for the lint.
|
|
pub fn enable(&mut self, lint: LintId, severity: Severity) {
|
|
self.lints.insert(lint, severity);
|
|
}
|
|
|
|
/// Disables `lint` if it was previously enabled.
|
|
pub fn disable(&mut self, lint: LintId) {
|
|
self.lints.remove(&lint);
|
|
}
|
|
|
|
/// Merges the enabled lints from `other` into this selection.
|
|
///
|
|
/// Lints from `other` will override any existing configuration.
|
|
pub fn merge(&mut self, other: &RuleSelection) {
|
|
self.lints.extend(other.iter());
|
|
}
|
|
}
|