Compare commits

..

11 Commits

Author SHA1 Message Date
Dhruv Manilawala
0ff25a540c Bump version to 0.4.1 (#11035)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-04-19 17:42:02 +05:30
Alex Waygood
34873ec009 Add a script to fuzz the parser (courtesy of pysource-codegen) (#11015) 2024-04-19 12:40:36 +01:00
Dhruv Manilawala
d3cd61f804 Use empty range when there's "gap" in token source (#11032)
## Summary

This fixes a bug where the parser would panic when there is a "gap" in
the token source.

What's a gap?

The reason it's `<=` instead of just `==` is because there could be
whitespaces between
the two tokens. For example:

```python
#     last token end
#     | current token (newline) start
#     v v
def foo \n
#      ^
#      assume there's trailing whitespace here
```

Or, there could tokens that are considered "trivia" and thus aren't
emitted by the token
source. These are comments and non-logical newlines. For example:

```python
#     last token end
#     v
def foo # comment\n
#                ^ current token (newline) start
```

In either of the above cases, there's a "gap" between the end of the
last token and start
of the current token.

## Test Plan

Add test cases and update the snapshots.
2024-04-19 11:36:26 +00:00
Alex Waygood
9b80cc09ee Select fewer ruff rules when linting Python files in scripts/ (#11034) 2024-04-19 12:33:36 +01:00
Dhruv Manilawala
9bb23b0a38 Expect indented case block instead of match stmt (#11033)
## Summary

This PR adds a new `Clause::Case` and uses it to parse the body of a
`case` block. Earlier, it was using `Match` which would give an
incorrect error message like:

```
  |
1 | match subject:
2 |     case 1:
3 |     case 2: ...
  |     ^^^^ Syntax Error: Expected an indented block after `match` statement
  |
```

## Test Plan

Add test case and update the snapshot.
2024-04-19 16:46:15 +05:30
Charlie Marsh
06c248a126 [ruff] Ignore stub functions in unused-async (RUF029) (#11026)
## Summary

We should ignore methods that appear to be stubs, e.g.:

```python
async def foo() -> int: ...
```

Closes https://github.com/astral-sh/ruff/issues/11018.
2024-04-19 00:03:52 -04:00
Tibor Reiss
27902b7130 [pylint] Implement invalid-index-returned (PLE0305) (#10962)
Add pylint rule invalid-index-returned (PLE0305)

See https://github.com/astral-sh/ruff/issues/970 for rules

Test Plan: `cargo test`
2024-04-19 03:44:05 +00:00
Henry Asa
97acf1d59b ENH: Bump ruff dependency versions to support the latest release of v0.4.0 and Python 3.12 (#11025)
## Summary

With the release of
[`v0.4.0`](https://github.com/astral-sh/ruff/releases/tag/v0.4.0) of
`ruff`, I noticed that some of `ruff`'s dependencies were not updated to
their latest versions. The
[`ruff-pre-commit`](https://github.com/astral-sh/ruff-pre-commit)
package released
[`v0.4.0`](https://github.com/astral-sh/ruff-pre-commit/releases/tag/v0.4.0)
at the same time `ruff` was updated, but `ruff` still referenced
`v0.3.7` of the package, not the newly updated version. I updated the
`ruff-pre-commit` reference to be `v0.4.0`.

In a similar light, I noticed that the version of the
[`dill`](https://github.com/uqfoundation/dill) package being used was
not the latest version. I bumped `dill` from version `0.3.7` to `0.3.8`,
which now [fully supports Python
3.12](https://github.com/uqfoundation/dill/releases/tag/0.3.8).

## Related Issues

Resolves #11024
2024-04-19 03:37:54 +00:00
Tibor Reiss
adf63d9013 [pylint] Implement invalid-hash-returned (PLE0309) (#10961)
Add pylint rule invalid-hash-returned (PLE0309)

See https://github.com/astral-sh/ruff/issues/970 for rules

Test Plan: `cargo test`

TBD: from the description: "Strictly speaking `bool` is a subclass of
`int`, thus returning `True`/`False` is valid. To be consistent with
other rules (e.g.
[PLE0305](https://github.com/astral-sh/ruff/pull/10962)
invalid-index-returned), ruff will raise, compared to pylint which will
not raise."
2024-04-19 03:33:52 +00:00
MithicSpirit
5d3c9f2637 ruff server: fix Neovim setup guide command (#11021) 2024-04-19 08:24:09 +05:30
Charlie Marsh
33529c049e Allow NoReturn-like functions for __str__, __len__, etc. (#11017)
## Summary

If the method always raises, we shouldn't raise a diagnostic for
"returning a value of the wrong type".

Closes https://github.com/astral-sh/ruff/issues/11016.
2024-04-18 22:55:15 +00:00
49 changed files with 1199 additions and 155 deletions

View File

@@ -55,7 +55,7 @@ repos:
pass_filenames: false # This makes it a lot faster
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.3.7
rev: v0.4.0
hooks:
- id: ruff-format
- id: ruff

View File

@@ -1,5 +1,19 @@
# Changelog
## 0.4.1
### Preview features
- \[`pylint`\] Implement `invalid-hash-returned` (`PLE0309`) ([#10961](https://github.com/astral-sh/ruff/pull/10961))
- \[`pylint`\] Implement `invalid-index-returned` (`PLE0305`) ([#10962](https://github.com/astral-sh/ruff/pull/10962))
### Bug fixes
- \[`pylint`\] Allow `NoReturn`-like functions for `__str__`, `__len__`, etc. (`PLE0307`) ([#11017](https://github.com/astral-sh/ruff/pull/11017))
- Parser: Use empty range when there's "gap" in token source ([#11032](https://github.com/astral-sh/ruff/pull/11032))
- \[`ruff`\] Ignore stub functions in `unused-async` (`RUF029`) ([#11026](https://github.com/astral-sh/ruff/pull/11026))
- Parser: Expect indented case block instead of match stmt ([#11033](https://github.com/astral-sh/ruff/pull/11033))
## 0.4.0
### A new, hand-written parser

6
Cargo.lock generated
View File

@@ -1835,7 +1835,7 @@ dependencies = [
[[package]]
name = "ruff"
version = "0.4.0"
version = "0.4.1"
dependencies = [
"anyhow",
"argfile",
@@ -1997,7 +1997,7 @@ dependencies = [
[[package]]
name = "ruff_linter"
version = "0.4.0"
version = "0.4.1"
dependencies = [
"aho-corasick",
"annotate-snippets 0.9.2",
@@ -2272,7 +2272,7 @@ dependencies = [
[[package]]
name = "ruff_shrinking"
version = "0.4.0"
version = "0.4.1"
dependencies = [
"anyhow",
"clap",

View File

@@ -151,7 +151,7 @@ Ruff can also be used as a [pre-commit](https://pre-commit.com/) hook via [`ruff
```yaml
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.4.0
rev: v0.4.1
hooks:
# Run the linter.
- id: ruff

View File

@@ -1,6 +1,6 @@
[package]
name = "ruff"
version = "0.4.0"
version = "0.4.1"
publish = false
authors = { workspace = true }
edition = { workspace = true }

View File

@@ -1,6 +1,6 @@
[package]
name = "ruff_linter"
version = "0.4.0"
version = "0.4.1"
publish = false
authors = { workspace = true }
edition = { workspace = true }

View File

@@ -21,12 +21,6 @@ class BytesNoReturn:
print("ruff") # [invalid-bytes-return]
class BytesWrongRaise:
def __bytes__(self):
print("raise some error")
raise NotImplementedError # [invalid-bytes-return]
# TODO: Once Ruff has better type checking
def return_bytes():
return "some string"
@@ -63,3 +57,9 @@ class Bytes4:
class Bytes5:
def __bytes__(self):
raise NotImplementedError
class Bytes6:
def __bytes__(self):
print("raise some error")
raise NotImplementedError

View File

@@ -0,0 +1,65 @@
# These testcases should raise errors
class Bool:
def __hash__(self):
return True # [invalid-hash-return]
class Float:
def __hash__(self):
return 3.05 # [invalid-hash-return]
class Str:
def __hash__(self):
return "ruff" # [invalid-hash-return]
class HashNoReturn:
def __hash__(self):
print("ruff") # [invalid-hash-return]
# TODO: Once Ruff has better type checking
def return_int():
return "3"
class ComplexReturn:
def __hash__(self):
return return_int() # [invalid-hash-return]
# These testcases should NOT raise errors
class Hash:
def __hash__(self):
return 7741
class Hash2:
def __hash__(self):
x = 7741
return x
class Hash3:
def __hash__(self): ...
class Has4:
def __hash__(self):
pass
class Hash5:
def __hash__(self):
raise NotImplementedError
class HashWrong6:
def __hash__(self):
print("raise some error")
raise NotImplementedError

View File

@@ -0,0 +1,73 @@
# These testcases should raise errors
class Bool:
"""pylint would not raise, but ruff does - see explanation in the docs"""
def __index__(self):
return True # [invalid-index-return]
class Float:
def __index__(self):
return 3.05 # [invalid-index-return]
class Dict:
def __index__(self):
return {"1": "1"} # [invalid-index-return]
class Str:
def __index__(self):
return "ruff" # [invalid-index-return]
class IndexNoReturn:
def __index__(self):
print("ruff") # [invalid-index-return]
# TODO: Once Ruff has better type checking
def return_index():
return "3"
class ComplexReturn:
def __index__(self):
return return_index() # [invalid-index-return]
# These testcases should NOT raise errors
class Index:
def __index__(self):
return 0
class Index2:
def __index__(self):
x = 1
return x
class Index3:
def __index__(self):
...
class Index4:
def __index__(self):
pass
class Index5:
def __index__(self):
raise NotImplementedError
class Index6:
def __index__(self):
print("raise some error")
raise NotImplementedError

View File

@@ -26,12 +26,6 @@ class LengthNegative:
return -42 # [invalid-length-return]
class LengthWrongRaise:
def __len__(self):
print("raise some error")
raise NotImplementedError # [invalid-length-return]
# TODO: Once Ruff has better type checking
def return_int():
return "3"
@@ -68,3 +62,9 @@ class Length4:
class Length5:
def __len__(self):
raise NotImplementedError
class Length6:
def __len__(self):
print("raise some error")
raise NotImplementedError

View File

@@ -47,3 +47,14 @@ class Str2:
class Str3:
def __str__(self): ...
class Str4:
def __str__(self):
raise RuntimeError("__str__ not allowed")
class Str5:
def __str__(self): # PLE0307 (returns None if x <= 0)
if x > 0:
raise RuntimeError("__str__ not allowed")

View File

@@ -22,7 +22,7 @@ async def pass_3(): # OK: uses an async loop
class Foo:
async def pass_4(): # OK: method of a class
async def pass_4(self): # OK: method of a class
pass
@@ -31,6 +31,10 @@ def foo():
await bla
async def pass_6(): # OK: just a stub
...
async def fail_1a(): # RUF029
time.sleep(1)
@@ -58,7 +62,7 @@ async def fail_4a(): # RUF029: the /outer/ function does not await
async def fail_4b(): # RUF029: the /outer/ function does not await
class Foo:
async def foo():
async def foo(self):
await bla

View File

@@ -103,6 +103,12 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
if checker.enabled(Rule::InvalidBytesReturnType) {
pylint::rules::invalid_bytes_return(checker, function_def);
}
if checker.enabled(Rule::InvalidIndexReturnType) {
pylint::rules::invalid_index_return(checker, function_def);
}
if checker.enabled(Rule::InvalidHashReturnType) {
pylint::rules::invalid_hash_return(checker, function_def);
}
if checker.enabled(Rule::InvalidStrReturnType) {
pylint::rules::invalid_str_return(checker, function_def);
}

View File

@@ -243,8 +243,10 @@ pub fn code_to_rule(linter: Linter, code: &str) -> Option<(RuleGroup, Rule)> {
(Pylint, "E0302") => (RuleGroup::Stable, rules::pylint::rules::UnexpectedSpecialMethodSignature),
(Pylint, "E0303") => (RuleGroup::Preview, rules::pylint::rules::InvalidLengthReturnType),
(Pylint, "E0304") => (RuleGroup::Preview, rules::pylint::rules::InvalidBoolReturnType),
(Pylint, "E0305") => (RuleGroup::Preview, rules::pylint::rules::InvalidIndexReturnType),
(Pylint, "E0307") => (RuleGroup::Stable, rules::pylint::rules::InvalidStrReturnType),
(Pylint, "E0308") => (RuleGroup::Preview, rules::pylint::rules::InvalidBytesReturnType),
(Pylint, "E0309") => (RuleGroup::Preview, rules::pylint::rules::InvalidHashReturnType),
(Pylint, "E0604") => (RuleGroup::Stable, rules::pylint::rules::InvalidAllObject),
(Pylint, "E0605") => (RuleGroup::Stable, rules::pylint::rules::InvalidAllFormat),
(Pylint, "E0643") => (RuleGroup::Preview, rules::pylint::rules::PotentialIndexError),

View File

@@ -77,14 +77,19 @@ mod tests {
#[test_case(Rule::InvalidAllFormat, Path::new("invalid_all_format.py"))]
#[test_case(Rule::InvalidAllObject, Path::new("invalid_all_object.py"))]
#[test_case(Rule::InvalidBoolReturnType, Path::new("invalid_return_type_bool.py"))]
#[test_case(
Rule::InvalidLengthReturnType,
Path::new("invalid_return_type_length.py")
)]
#[test_case(
Rule::InvalidBytesReturnType,
Path::new("invalid_return_type_bytes.py")
)]
#[test_case(
Rule::InvalidIndexReturnType,
Path::new("invalid_return_type_index.py")
)]
#[test_case(Rule::InvalidHashReturnType, Path::new("invalid_return_type_hash.py"))]
#[test_case(
Rule::InvalidLengthReturnType,
Path::new("invalid_return_type_length.py")
)]
#[test_case(Rule::InvalidStrReturnType, Path::new("invalid_return_type_str.py"))]
#[test_case(Rule::DuplicateBases, Path::new("duplicate_bases.py"))]
#[test_case(Rule::InvalidCharacterBackspace, Path::new("invalid_characters.py"))]

View File

@@ -5,6 +5,7 @@ use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{self as ast};
use ruff_python_semantic::analyze::function_type::is_stub;
use ruff_python_semantic::analyze::terminal::Terminal;
use ruff_python_semantic::analyze::type_inference::{NumberLike, PythonType, ResolvedPythonType};
use ruff_text_size::Ranged;
@@ -43,7 +44,7 @@ impl Violation for InvalidBoolReturnType {
}
}
/// E0307
/// PLE0304
pub(crate) fn invalid_bool_return(checker: &mut Checker, function_def: &ast::StmtFunctionDef) {
if function_def.name.as_str() != "__bool__" {
return;
@@ -57,19 +58,29 @@ pub(crate) fn invalid_bool_return(checker: &mut Checker, function_def: &ast::Stm
return;
}
// Determine the terminal behavior (i.e., implicit return, no return, etc.).
let terminal = Terminal::from_function(function_def);
// If every control flow path raises an exception, ignore the function.
if terminal == Terminal::Raise {
return;
}
// If there are no return statements, add a diagnostic.
if terminal == Terminal::Implicit {
checker.diagnostics.push(Diagnostic::new(
InvalidBoolReturnType,
function_def.identifier(),
));
return;
}
let returns = {
let mut visitor = ReturnStatementVisitor::default();
visitor.visit_body(&function_def.body);
visitor.returns
};
if returns.is_empty() {
checker.diagnostics.push(Diagnostic::new(
InvalidBoolReturnType,
function_def.identifier(),
));
}
for stmt in returns {
if let Some(value) = stmt.value.as_deref() {
if !matches!(

View File

@@ -5,6 +5,7 @@ use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{self as ast};
use ruff_python_semantic::analyze::function_type::is_stub;
use ruff_python_semantic::analyze::terminal::Terminal;
use ruff_python_semantic::analyze::type_inference::{PythonType, ResolvedPythonType};
use ruff_text_size::Ranged;
@@ -43,7 +44,7 @@ impl Violation for InvalidBytesReturnType {
}
}
/// E0308
/// PLE0308
pub(crate) fn invalid_bytes_return(checker: &mut Checker, function_def: &ast::StmtFunctionDef) {
if function_def.name.as_str() != "__bytes__" {
return;
@@ -57,19 +58,29 @@ pub(crate) fn invalid_bytes_return(checker: &mut Checker, function_def: &ast::St
return;
}
// Determine the terminal behavior (i.e., implicit return, no return, etc.).
let terminal = Terminal::from_function(function_def);
// If every control flow path raises an exception, ignore the function.
if terminal == Terminal::Raise {
return;
}
// If there are no return statements, add a diagnostic.
if terminal == Terminal::Implicit {
checker.diagnostics.push(Diagnostic::new(
InvalidBytesReturnType,
function_def.identifier(),
));
return;
}
let returns = {
let mut visitor = ReturnStatementVisitor::default();
visitor.visit_body(&function_def.body);
visitor.returns
};
if returns.is_empty() {
checker.diagnostics.push(Diagnostic::new(
InvalidBytesReturnType,
function_def.identifier(),
));
}
for stmt in returns {
if let Some(value) = stmt.value.as_deref() {
if !matches!(

View File

@@ -0,0 +1,107 @@
use ruff_diagnostics::{Diagnostic, Violation};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::helpers::ReturnStatementVisitor;
use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{self as ast};
use ruff_python_semantic::analyze::function_type::is_stub;
use ruff_python_semantic::analyze::terminal::Terminal;
use ruff_python_semantic::analyze::type_inference::{NumberLike, PythonType, ResolvedPythonType};
use ruff_text_size::Ranged;
use crate::checkers::ast::Checker;
/// ## What it does
/// Checks for `__hash__` implementations that return a type other than `integer`.
///
/// ## Why is this bad?
/// The `__hash__` method should return an `integer`. Returning a different
/// type may cause unexpected behavior.
///
/// Note: `bool` is a subclass of `int`, so it's technically valid for `__hash__` to
/// return `True` or `False`. However, for consistency with other rules, Ruff will
/// still raise when `__hash__` returns a `bool`.
///
/// ## Example
/// ```python
/// class Foo:
/// def __hash__(self):
/// return "2"
/// ```
///
/// Use instead:
/// ```python
/// class Foo:
/// def __hash__(self):
/// return 2
/// ```
///
///
/// ## References
/// - [Python documentation: The `__hash__` method](https://docs.python.org/3/reference/datamodel.html#object.__hash__)
#[violation]
pub struct InvalidHashReturnType;
impl Violation for InvalidHashReturnType {
#[derive_message_formats]
fn message(&self) -> String {
format!("`__hash__` does not return an integer")
}
}
/// E0309
pub(crate) fn invalid_hash_return(checker: &mut Checker, function_def: &ast::StmtFunctionDef) {
if function_def.name.as_str() != "__hash__" {
return;
}
if !checker.semantic().current_scope().kind.is_class() {
return;
}
if is_stub(function_def, checker.semantic()) {
return;
}
// Determine the terminal behavior (i.e., implicit return, no return, etc.).
let terminal = Terminal::from_function(function_def);
// If every control flow path raises an exception, ignore the function.
if terminal == Terminal::Raise {
return;
}
// If there are no return statements, add a diagnostic.
if terminal == Terminal::Implicit {
checker.diagnostics.push(Diagnostic::new(
InvalidHashReturnType,
function_def.identifier(),
));
return;
}
let returns = {
let mut visitor = ReturnStatementVisitor::default();
visitor.visit_body(&function_def.body);
visitor.returns
};
for stmt in returns {
if let Some(value) = stmt.value.as_deref() {
if !matches!(
ResolvedPythonType::from(value),
ResolvedPythonType::Unknown
| ResolvedPythonType::Atom(PythonType::Number(NumberLike::Integer))
) {
checker
.diagnostics
.push(Diagnostic::new(InvalidHashReturnType, value.range()));
}
} else {
// Disallow implicit `None`.
checker
.diagnostics
.push(Diagnostic::new(InvalidHashReturnType, stmt.range()));
}
}
}

View File

@@ -0,0 +1,109 @@
use ruff_diagnostics::{Diagnostic, Violation};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::helpers::ReturnStatementVisitor;
use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{self as ast};
use ruff_python_semantic::analyze::function_type::is_stub;
use ruff_python_semantic::analyze::terminal::Terminal;
use ruff_python_semantic::analyze::type_inference::{NumberLike, PythonType, ResolvedPythonType};
use ruff_text_size::Ranged;
use crate::checkers::ast::Checker;
/// ## What it does
/// Checks for `__index__` implementations that return a type other than `integer`.
///
/// ## Why is this bad?
/// The `__index__` method should return an `integer`. Returning a different
/// type may cause unexpected behavior.
///
/// Note: `bool` is a subclass of `int`, so it's technically valid for `__index__` to
/// return `True` or `False`. However, a DeprecationWarning (`DeprecationWarning:
/// __index__ returned non-int (type bool)`) for such cases was already introduced,
/// thus this is a conscious difference between the original pylint rule and the
/// current ruff implementation.
///
/// ## Example
/// ```python
/// class Foo:
/// def __index__(self):
/// return "2"
/// ```
///
/// Use instead:
/// ```python
/// class Foo:
/// def __index__(self):
/// return 2
/// ```
///
///
/// ## References
/// - [Python documentation: The `__index__` method](https://docs.python.org/3/reference/datamodel.html#object.__index__)
#[violation]
pub struct InvalidIndexReturnType;
impl Violation for InvalidIndexReturnType {
#[derive_message_formats]
fn message(&self) -> String {
format!("`__index__` does not return an integer")
}
}
/// E0305
pub(crate) fn invalid_index_return(checker: &mut Checker, function_def: &ast::StmtFunctionDef) {
if function_def.name.as_str() != "__index__" {
return;
}
if !checker.semantic().current_scope().kind.is_class() {
return;
}
if is_stub(function_def, checker.semantic()) {
return;
}
// Determine the terminal behavior (i.e., implicit return, no return, etc.).
let terminal = Terminal::from_function(function_def);
// If every control flow path raises an exception, ignore the function.
if terminal == Terminal::Raise {
return;
}
// If there are no return statements, add a diagnostic.
if terminal == Terminal::Implicit {
checker.diagnostics.push(Diagnostic::new(
InvalidIndexReturnType,
function_def.identifier(),
));
return;
}
let returns = {
let mut visitor = ReturnStatementVisitor::default();
visitor.visit_body(&function_def.body);
visitor.returns
};
for stmt in returns {
if let Some(value) = stmt.value.as_deref() {
if !matches!(
ResolvedPythonType::from(value),
ResolvedPythonType::Unknown
| ResolvedPythonType::Atom(PythonType::Number(NumberLike::Integer))
) {
checker
.diagnostics
.push(Diagnostic::new(InvalidIndexReturnType, value.range()));
}
} else {
// Disallow implicit `None`.
checker
.diagnostics
.push(Diagnostic::new(InvalidIndexReturnType, stmt.range()));
}
}
}

View File

@@ -5,6 +5,7 @@ use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{self as ast, Expr};
use ruff_python_semantic::analyze::function_type::is_stub;
use ruff_python_semantic::analyze::terminal::Terminal;
use ruff_python_semantic::analyze::type_inference::{NumberLike, PythonType, ResolvedPythonType};
use ruff_text_size::Ranged;
@@ -63,19 +64,29 @@ pub(crate) fn invalid_length_return(checker: &mut Checker, function_def: &ast::S
return;
}
// Determine the terminal behavior (i.e., implicit return, no return, etc.).
let terminal = Terminal::from_function(function_def);
// If every control flow path raises an exception, ignore the function.
if terminal == Terminal::Raise {
return;
}
// If there are no return statements, add a diagnostic.
if terminal == Terminal::Implicit {
checker.diagnostics.push(Diagnostic::new(
InvalidLengthReturnType,
function_def.identifier(),
));
return;
}
let returns = {
let mut visitor = ReturnStatementVisitor::default();
visitor.visit_body(&function_def.body);
visitor.returns
};
if returns.is_empty() {
checker.diagnostics.push(Diagnostic::new(
InvalidLengthReturnType,
function_def.identifier(),
));
}
for stmt in returns {
if let Some(value) = stmt.value.as_deref() {
if is_negative_integer(value)

View File

@@ -5,6 +5,7 @@ use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{self as ast};
use ruff_python_semantic::analyze::function_type::is_stub;
use ruff_python_semantic::analyze::terminal::Terminal;
use ruff_python_semantic::analyze::type_inference::{PythonType, ResolvedPythonType};
use ruff_text_size::Ranged;
@@ -57,19 +58,29 @@ pub(crate) fn invalid_str_return(checker: &mut Checker, function_def: &ast::Stmt
return;
}
// Determine the terminal behavior (i.e., implicit return, no return, etc.).
let terminal = Terminal::from_function(function_def);
// If every control flow path raises an exception, ignore the function.
if terminal == Terminal::Raise {
return;
}
// If there are no return statements, add a diagnostic.
if terminal == Terminal::Implicit {
checker.diagnostics.push(Diagnostic::new(
InvalidStrReturnType,
function_def.identifier(),
));
return;
}
let returns = {
let mut visitor = ReturnStatementVisitor::default();
visitor.visit_body(&function_def.body);
visitor.returns
};
if returns.is_empty() {
checker.diagnostics.push(Diagnostic::new(
InvalidStrReturnType,
function_def.identifier(),
));
}
for stmt in returns {
if let Some(value) = stmt.value.as_deref() {
if !matches!(

View File

@@ -31,6 +31,8 @@ pub(crate) use invalid_bool_return::*;
pub(crate) use invalid_bytes_return::*;
pub(crate) use invalid_envvar_default::*;
pub(crate) use invalid_envvar_value::*;
pub(crate) use invalid_hash_return::*;
pub(crate) use invalid_index_return::*;
pub(crate) use invalid_length_return::*;
pub(crate) use invalid_str_return::*;
pub(crate) use invalid_string_characters::*;
@@ -131,6 +133,8 @@ mod invalid_bool_return;
mod invalid_bytes_return;
mod invalid_envvar_default;
mod invalid_envvar_value;
mod invalid_hash_return;
mod invalid_index_return;
mod invalid_length_return;
mod invalid_str_return;
mod invalid_string_characters;

View File

@@ -59,7 +59,7 @@ impl Violation for NonSlotAssignment {
}
}
/// E0237
/// PLE0237
pub(crate) fn non_slot_assignment(checker: &mut Checker, class_def: &ast::StmtClassDef) {
let semantic = checker.semantic();

View File

@@ -40,12 +40,3 @@ invalid_return_type_length.py:26:16: PLE0303 `__len__` does not return a non-neg
26 | return -42 # [invalid-length-return]
| ^^^ PLE0303
|
invalid_return_type_length.py:30:9: PLE0303 `__len__` does not return a non-negative integer
|
29 | class LengthWrongRaise:
30 | def __len__(self):
| ^^^^^^^ PLE0303
31 | print("raise some error")
32 | raise NotImplementedError # [invalid-length-return]
|

View File

@@ -0,0 +1,41 @@
---
source: crates/ruff_linter/src/rules/pylint/mod.rs
---
invalid_return_type_index.py:8:16: PLE0305 `__index__` does not return an integer
|
7 | def __index__(self):
8 | return True # [invalid-index-return]
| ^^^^ PLE0305
|
invalid_return_type_index.py:13:16: PLE0305 `__index__` does not return an integer
|
11 | class Float:
12 | def __index__(self):
13 | return 3.05 # [invalid-index-return]
| ^^^^ PLE0305
|
invalid_return_type_index.py:18:16: PLE0305 `__index__` does not return an integer
|
16 | class Dict:
17 | def __index__(self):
18 | return {"1": "1"} # [invalid-index-return]
| ^^^^^^^^^^ PLE0305
|
invalid_return_type_index.py:23:16: PLE0305 `__index__` does not return an integer
|
21 | class Str:
22 | def __index__(self):
23 | return "ruff" # [invalid-index-return]
| ^^^^^^ PLE0305
|
invalid_return_type_index.py:27:9: PLE0305 `__index__` does not return an integer
|
26 | class IndexNoReturn:
27 | def __index__(self):
| ^^^^^^^^^ PLE0305
28 | print("ruff") # [invalid-index-return]
|

View File

@@ -32,3 +32,12 @@ invalid_return_type_str.py:21:16: PLE0307 `__str__` does not return `str`
21 | return False
| ^^^^^ PLE0307
|
invalid_return_type_str.py:58:9: PLE0307 `__str__` does not return `str`
|
57 | class Str5:
58 | def __str__(self): # PLE0307 (returns None if x <= 0)
| ^^^^^^^ PLE0307
59 | if x > 0:
60 | raise RuntimeError("__str__ not allowed")
|

View File

@@ -32,12 +32,3 @@ invalid_return_type_bytes.py:20:9: PLE0308 `__bytes__` does not return `bytes`
| ^^^^^^^^^ PLE0308
21 | print("ruff") # [invalid-bytes-return]
|
invalid_return_type_bytes.py:25:9: PLE0308 `__bytes__` does not return `bytes`
|
24 | class BytesWrongRaise:
25 | def __bytes__(self):
| ^^^^^^^^^ PLE0308
26 | print("raise some error")
27 | raise NotImplementedError # [invalid-bytes-return]
|

View File

@@ -0,0 +1,34 @@
---
source: crates/ruff_linter/src/rules/pylint/mod.rs
---
invalid_return_type_hash.py:6:16: PLE0309 `__hash__` does not return an integer
|
4 | class Bool:
5 | def __hash__(self):
6 | return True # [invalid-hash-return]
| ^^^^ PLE0309
|
invalid_return_type_hash.py:11:16: PLE0309 `__hash__` does not return an integer
|
9 | class Float:
10 | def __hash__(self):
11 | return 3.05 # [invalid-hash-return]
| ^^^^ PLE0309
|
invalid_return_type_hash.py:16:16: PLE0309 `__hash__` does not return an integer
|
14 | class Str:
15 | def __hash__(self):
16 | return "ruff" # [invalid-hash-return]
| ^^^^^^ PLE0309
|
invalid_return_type_hash.py:20:9: PLE0309 `__hash__` does not return an integer
|
19 | class HashNoReturn:
20 | def __hash__(self):
| ^^^^^^^^ PLE0309
21 | print("ruff") # [invalid-hash-return]
|

View File

@@ -3,6 +3,7 @@ use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::visitor::preorder;
use ruff_python_ast::{self as ast, AnyNodeRef, Expr, Stmt};
use ruff_python_semantic::analyze::function_type::is_stub;
use crate::checkers::ast::Checker;
@@ -160,6 +161,11 @@ pub(crate) fn unused_async(
return;
}
// Ignore stubs (e.g., `...`).
if is_stub(function_def, checker.semantic()) {
return;
}
let found_await_or_async = {
let mut visitor = AsyncExprVisitor::default();
preorder::walk_body(&mut visitor, body);

View File

@@ -1,56 +1,48 @@
---
source: crates/ruff_linter/src/rules/ruff/mod.rs
---
RUF029.py:34:11: RUF029 Function `fail_1a` is declared `async`, but doesn't `await` or use `async` features.
RUF029.py:38:11: RUF029 Function `fail_1a` is declared `async`, but doesn't `await` or use `async` features.
|
34 | async def fail_1a(): # RUF029
38 | async def fail_1a(): # RUF029
| ^^^^^^^ RUF029
35 | time.sleep(1)
39 | time.sleep(1)
|
RUF029.py:38:11: RUF029 Function `fail_1b` is declared `async`, but doesn't `await` or use `async` features.
RUF029.py:42:11: RUF029 Function `fail_1b` is declared `async`, but doesn't `await` or use `async` features.
|
38 | async def fail_1b(): # RUF029: yield does not require async
42 | async def fail_1b(): # RUF029: yield does not require async
| ^^^^^^^ RUF029
39 | yield "hello"
43 | yield "hello"
|
RUF029.py:42:11: RUF029 Function `fail_2` is declared `async`, but doesn't `await` or use `async` features.
RUF029.py:46:11: RUF029 Function `fail_2` is declared `async`, but doesn't `await` or use `async` features.
|
42 | async def fail_2(): # RUF029
46 | async def fail_2(): # RUF029
| ^^^^^^ RUF029
43 | with None as i:
44 | pass
47 | with None as i:
48 | pass
|
RUF029.py:47:11: RUF029 Function `fail_3` is declared `async`, but doesn't `await` or use `async` features.
RUF029.py:51:11: RUF029 Function `fail_3` is declared `async`, but doesn't `await` or use `async` features.
|
47 | async def fail_3(): # RUF029
51 | async def fail_3(): # RUF029
| ^^^^^^ RUF029
48 | for i in []:
49 | pass
52 | for i in []:
53 | pass
|
RUF029.py:54:11: RUF029 Function `fail_4a` is declared `async`, but doesn't `await` or use `async` features.
RUF029.py:58:11: RUF029 Function `fail_4a` is declared `async`, but doesn't `await` or use `async` features.
|
54 | async def fail_4a(): # RUF029: the /outer/ function does not await
58 | async def fail_4a(): # RUF029: the /outer/ function does not await
| ^^^^^^^ RUF029
55 | async def foo():
56 | await bla
59 | async def foo():
60 | await bla
|
RUF029.py:59:11: RUF029 Function `fail_4b` is declared `async`, but doesn't `await` or use `async` features.
RUF029.py:63:11: RUF029 Function `fail_4b` is declared `async`, but doesn't `await` or use `async` features.
|
59 | async def fail_4b(): # RUF029: the /outer/ function does not await
63 | async def fail_4b(): # RUF029: the /outer/ function does not await
| ^^^^^^^ RUF029
60 | class Foo:
61 | async def foo():
|
RUF029.py:66:15: RUF029 Function `fail_4c` is declared `async`, but doesn't `await` or use `async` features.
|
65 | def foo():
66 | async def fail_4c(): # RUF029: the /inner/ function does not await
| ^^^^^^^ RUF029
67 | pass
64 | class Foo:
65 | async def foo(self):
|

View File

@@ -0,0 +1,3 @@
match subject:
case 1:
case 2: ...

View File

@@ -0,0 +1,3 @@
def foo # comment
def bar(): ...
def baz

View File

@@ -261,12 +261,59 @@ impl<'src> Parser<'src> {
}
fn node_range(&self, start: TextSize) -> TextRange {
// It's possible during error recovery that the parsing didn't consume any tokens. In that case,
// `last_token_end` still points to the end of the previous token but `start` is the start of the current token.
// Calling `TextRange::new(start, self.last_token_end)` would panic in that case because `start > end`.
// This path "detects" this case and creates an empty range instead.
if self.node_start() == start {
TextRange::empty(start)
// It's possible during error recovery that the parsing didn't consume any tokens. In that
// case, `last_token_end` still points to the end of the previous token but `start` is the
// start of the current token. Calling `TextRange::new(start, self.last_token_end)` would
// panic in that case because `start > end`. This path "detects" this case and creates an
// empty range instead.
//
// The reason it's `<=` instead of just `==` is because there could be whitespaces between
// the two tokens. For example:
//
// ```python
// # last token end
// # | current token (newline) start
// # v v
// def foo \n
// # ^
// # assume there's trailing whitespace here
// ```
//
// Or, there could tokens that are considered "trivia" and thus aren't emitted by the token
// source. These are comments and non-logical newlines. For example:
//
// ```python
// # last token end
// # v
// def foo # comment\n
// # ^ current token (newline) start
// ```
//
// In either of the above cases, there's a "gap" between the end of the last token and start
// of the current token.
if self.last_token_end <= start {
// We need to create an empty range at the last token end instead of the start because
// otherwise this node range will fall outside the range of it's parent node. Taking
// the above example:
//
// ```python
// if True:
// # function start
// # | function end
// # v v
// def foo # comment
// # ^ current token start
// ```
//
// Here, the current token start is the start of parameter range but the function ends
// at `foo`. Even if there's a function body, the range of parameters would still be
// before the comment.
// test_err node_range_with_gaps
// def foo # comment
// def bar(): ...
// def baz
TextRange::empty(self.last_token_end)
} else {
TextRange::new(start, self.last_token_end)
}

View File

@@ -1663,23 +1663,19 @@ impl<'src> Parser<'src> {
// x = 10
let type_params = self.try_parse_type_params();
// test_ok function_def_parameter_range
// def foo(
// first: int,
// second: int,
// ) -> int: ...
// test_err function_def_unclosed_parameter_list
// def foo(a: int, b:
// def foo():
// return 42
// def foo(a: int, b: str
// x = 10
let parameters_start = self.node_start();
self.expect(TokenKind::Lpar);
let mut parameters = self.parse_parameters(FunctionKind::FunctionDef);
self.expect(TokenKind::Rpar);
// test_ok function_def_parameter_range
// def foo(
// first: int,
// second: int,
// ) -> int: ...
parameters.range = self.node_range(parameters_start);
let parameters = self.parse_parameters(FunctionKind::FunctionDef);
let returns = if self.eat(TokenKind::Rarrow) {
if self.at_expr() {
@@ -2492,7 +2488,12 @@ impl<'src> Parser<'src> {
};
self.expect(TokenKind::Colon);
let body = self.parse_body(Clause::Match);
// test_err case_expect_indented_block
// match subject:
// case 1:
// case 2: ...
let body = self.parse_body(Clause::Case);
ast::MatchCase {
pattern,
@@ -2839,19 +2840,16 @@ impl<'src> Parser<'src> {
pub(super) fn parse_parameters(&mut self, function_kind: FunctionKind) -> ast::Parameters {
let start = self.node_start();
if matches!(function_kind, FunctionKind::FunctionDef) {
self.expect(TokenKind::Lpar);
}
// TODO(dhruvmanila): This has the same problem as `parse_match_pattern_mapping`
// has where if there are multiple kwarg or vararg, the last one will win and
// the parser will drop the previous ones. Another thing is the vararg and kwarg
// uses `Parameter` (not `ParameterWithDefault`) which means that the parser cannot
// recover well from `*args=(1, 2)`.
let mut parameters = ast::Parameters {
range: TextRange::default(),
posonlyargs: vec![],
args: vec![],
kwonlyargs: vec![],
vararg: None,
kwarg: None,
};
let mut parameters = ast::Parameters::empty(TextRange::default());
let mut seen_default_param = false; // `a=10`
let mut seen_positional_only_separator = false; // `/`
@@ -3089,6 +3087,10 @@ impl<'src> Parser<'src> {
self.add_error(ParseErrorType::ExpectedKeywordParam, star_range);
}
if matches!(function_kind, FunctionKind::FunctionDef) {
self.expect(TokenKind::Rpar);
}
parameters.range = self.node_range(start);
// test_err params_duplicate_names
@@ -3363,7 +3365,7 @@ enum Clause {
Class,
While,
FunctionDef,
Match,
Case,
Try,
Except,
Finally,
@@ -3380,7 +3382,7 @@ impl Display for Clause {
Clause::Class => write!(f, "`class` definition"),
Clause::While => write!(f, "`while` statement"),
Clause::FunctionDef => write!(f, "function definition"),
Clause::Match => write!(f, "`match` statement"),
Clause::Case => write!(f, "`case` block"),
Clause::Try => write!(f, "`try` statement"),
Clause::Except => write!(f, "`except` clause"),
Clause::Finally => write!(f, "`finally` clause"),

View File

@@ -0,0 +1,84 @@
---
source: crates/ruff_python_parser/tests/fixtures.rs
input_file: crates/ruff_python_parser/resources/inline/err/case_expect_indented_block.py
---
## AST
```
Module(
ModModule {
range: 0..43,
body: [
Match(
StmtMatch {
range: 0..42,
subject: Name(
ExprName {
range: 6..13,
id: "subject",
ctx: Load,
},
),
cases: [
MatchCase {
range: 19..26,
pattern: MatchValue(
PatternMatchValue {
range: 24..25,
value: NumberLiteral(
ExprNumberLiteral {
range: 24..25,
value: Int(
1,
),
},
),
},
),
guard: None,
body: [],
},
MatchCase {
range: 31..42,
pattern: MatchValue(
PatternMatchValue {
range: 36..37,
value: NumberLiteral(
ExprNumberLiteral {
range: 36..37,
value: Int(
2,
),
},
),
},
),
guard: None,
body: [
Expr(
StmtExpr {
range: 39..42,
value: EllipsisLiteral(
ExprEllipsisLiteral {
range: 39..42,
},
),
},
),
],
},
],
},
),
],
},
)
```
## Errors
|
1 | match subject:
2 | case 1:
3 | case 2: ...
| ^^^^ Syntax Error: Expected an indented block after `case` block
|

View File

@@ -0,0 +1,122 @@
---
source: crates/ruff_python_parser/tests/fixtures.rs
input_file: crates/ruff_python_parser/resources/inline/err/node_range_with_gaps.py
---
## AST
```
Module(
ModModule {
range: 0..41,
body: [
FunctionDef(
StmtFunctionDef {
range: 0..7,
is_async: false,
decorator_list: [],
name: Identifier {
id: "foo",
range: 4..7,
},
type_params: None,
parameters: Parameters {
range: 7..7,
posonlyargs: [],
args: [],
vararg: None,
kwonlyargs: [],
kwarg: None,
},
returns: None,
body: [],
},
),
FunctionDef(
StmtFunctionDef {
range: 18..32,
is_async: false,
decorator_list: [],
name: Identifier {
id: "bar",
range: 22..25,
},
type_params: None,
parameters: Parameters {
range: 25..27,
posonlyargs: [],
args: [],
vararg: None,
kwonlyargs: [],
kwarg: None,
},
returns: None,
body: [
Expr(
StmtExpr {
range: 29..32,
value: EllipsisLiteral(
ExprEllipsisLiteral {
range: 29..32,
},
),
},
),
],
},
),
FunctionDef(
StmtFunctionDef {
range: 33..40,
is_async: false,
decorator_list: [],
name: Identifier {
id: "baz",
range: 37..40,
},
type_params: None,
parameters: Parameters {
range: 40..40,
posonlyargs: [],
args: [],
vararg: None,
kwonlyargs: [],
kwarg: None,
},
returns: None,
body: [],
},
),
],
},
)
```
## Errors
|
1 | def foo # comment
| ^ Syntax Error: Expected '(', found newline
2 | def bar(): ...
3 | def baz
|
|
1 | def foo # comment
2 | def bar(): ...
| ^^^ Syntax Error: Expected ')', found 'def'
3 | def baz
|
|
1 | def foo # comment
2 | def bar(): ...
3 | def baz
| ^ Syntax Error: Expected '(', found newline
|
|
2 | def bar(): ...
3 | def baz
|

View File

@@ -167,7 +167,7 @@ Module(
conversion: None,
format_spec: Some(
FStringFormatSpec {
range: 43..43,
range: 42..42,
elements: [],
},
),

View File

@@ -7,7 +7,7 @@
1. Finally, add this to your `init.lua`:
```lua
require('lspconfig').ruff.setup()
require('lspconfig').ruff.setup {}
```
See [`nvim-lspconfig`'s server configuration guide](https://github.com/neovim/nvim-lspconfig/blob/master/doc/server_configurations.md#ruff) for more details

View File

@@ -1,6 +1,6 @@
[package]
name = "ruff_shrinking"
version = "0.4.0"
version = "0.4.1"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

View File

@@ -14,7 +14,7 @@ Ruff can be used as a [pre-commit](https://pre-commit.com) hook via [`ruff-pre-c
```yaml
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.4.0
rev: v0.4.1
hooks:
# Run the linter.
- id: ruff
@@ -27,7 +27,7 @@ To enable lint fixes, add the `--fix` argument to the lint hook:
```yaml
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.4.0
rev: v0.4.1
hooks:
# Run the linter.
- id: ruff
@@ -41,7 +41,7 @@ To run the hooks over Jupyter Notebooks too, add `jupyter` to the list of allowe
```yaml
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.4.0
rev: v0.4.1
hooks:
# Run the linter.
- id: ruff

View File

@@ -4,7 +4,7 @@ build-backend = "maturin"
[project]
name = "ruff"
version = "0.4.0"
version = "0.4.1"
description = "An extremely fast Python linter and code formatter, written in Rust."
authors = [{ name = "Astral Software Inc.", email = "hey@astral.sh" }]
readme = "README.md"

2
ruff.schema.json generated
View File

@@ -3269,8 +3269,10 @@
"PLE0302",
"PLE0303",
"PLE0304",
"PLE0305",
"PLE0307",
"PLE0308",
"PLE0309",
"PLE06",
"PLE060",
"PLE0604",

View File

@@ -128,14 +128,14 @@ files = [
[[package]]
name = "dill"
version = "0.3.7"
version = "0.3.8"
description = "serialize all of Python"
category = "main"
optional = false
python-versions = ">=3.7"
files = [
{file = "dill-0.3.7-py3-none-any.whl", hash = "sha256:76b122c08ef4ce2eedcd4d1abd8e641114bfc6c2867f49f3c41facf65bf19f5e"},
{file = "dill-0.3.7.tar.gz", hash = "sha256:cc1c8b182eb3013e24bd475ff2e9295af86c1a38eb1aff128dac8962a9ce3c03"},
{file = "dill-0.3.8-py3-none-any.whl", hash = "sha256:c36ca9ffb54365bdd2f8eb3eff7d2a21237f8452b57ace88b1ac615b7e815bd7"},
{file = "dill-0.3.8.tar.gz", hash = "sha256:3ebe3c479ad625c4553aca177444d89b486b1d84982eeacded644afc0cf797ca"},
]
[package.extras]

View File

@@ -1,6 +1,6 @@
[tool.poetry]
name = "scripts"
version = "0.4.0"
version = "0.4.1"
description = ""
authors = ["Charles Marsh <charlie.r.marsh@gmail.com>"]

View File

@@ -444,7 +444,7 @@ async def main(
if matches is None:
# Handle case where there are no regex matches e.g.
# + "?application=AIRFLOW&authenticator=TEST_AUTH&role=TEST_ROLE&warehouse=TEST_WAREHOUSE" # noqa: E501, ERA001
# + "?application=AIRFLOW&authenticator=TEST_AUTH&role=TEST_ROLE&warehouse=TEST_WAREHOUSE" # noqa: E501
# Which was found in local testing
continue

239
scripts/fuzz-parser/fuzz.py Normal file
View File

@@ -0,0 +1,239 @@
"""
Run the parser on randomly generated (but syntactically valid) Python source-code files.
To install all dependencies for this script into an environment using `uv`, run:
uv pip install -r scripts/fuzz-parser/requirements.txt
Example invocations of the script:
- Run the fuzzer using seeds 0, 1, 2, 78 and 93 to generate the code:
`python scripts/fuzz-parser/fuzz.py 0-2 78 93`
- Run the fuzzer concurrently using seeds in range 0-10 inclusive,
but only reporting bugs that are new on your branch:
`python scripts/fuzz-parser/fuzz.py 0-10 --new-bugs-only`
- Run the fuzzer concurrently on 10,000 different Python source-code files,
and only print a summary at the end:
`python scripts/fuzz-parser/fuzz.py 1-10000 --quiet
N.B. The script takes a few seconds to get started, as the script needs to compile
your checked out version of ruff with `--release` as a first step before it
can actually start fuzzing.
"""
from __future__ import annotations
import argparse
import concurrent.futures
import subprocess
from dataclasses import KW_ONLY, dataclass
from typing import NewType
from pysource_codegen import generate as generate_random_code
from pysource_minimize import minimize as minimize_repro
from termcolor import colored
MinimizedSourceCode = NewType("MinimizedSourceCode", str)
Seed = NewType("Seed", int)
def run_ruff(executable_args: list[str], code: str) -> subprocess.CompletedProcess[str]:
return subprocess.run(
[*executable_args, "check", "--select=E999", "--no-cache", "-"],
capture_output=True,
text=True,
input=code,
)
def contains_bug(code: str, *, only_new_bugs: bool = False) -> bool:
"""Return True if the code triggers a parser error and False otherwise.
If `only_new_bugs` is set to `True`,
the function also runs an installed version of Ruff on the same source code,
and only returns `True` if the bug appears on the branch you have currently
checked out but *not* in the latest release.
"""
new_result = run_ruff(["cargo", "run", "--release", "--"], code)
if not only_new_bugs:
return new_result.returncode != 0
if new_result.returncode == 0:
return False
old_result = run_ruff(["ruff"], code)
return old_result.returncode == 0
@dataclass(slots=True)
class FuzzResult:
# The seed used to generate the random Python file.
# The same seed always generates the same file.
seed: Seed
# If we found a bug, this will be the minimum Python code
# required to trigger the bug. If not, it will be `None`.
maybe_bug: MinimizedSourceCode | None
def print_description(self) -> None:
"""Describe the results of fuzzing the parser with this seed."""
if self.maybe_bug:
print(colored(f"Ran fuzzer on seed {self.seed}", "red"))
print(colored("The following code triggers a bug:", "red"))
print()
print(self.maybe_bug)
print()
else:
print(colored(f"Ran fuzzer successfully on seed {self.seed}", "green"))
def fuzz_code(seed: Seed, only_new_bugs: bool) -> FuzzResult:
"""Return a `FuzzResult` instance describing the fuzzing result from this seed."""
code = generate_random_code(seed)
if contains_bug(code, only_new_bugs=only_new_bugs):
try:
new_code = minimize_repro(code, contains_bug)
except ValueError:
# `pysource_minimize.minimize()` sometimes raises `ValueError` internally.
# Just ignore it if so, and use the original generated code;
# minimizing the repro is a nice-to-have, but isn't crucial.
new_code = code
return FuzzResult(seed, MinimizedSourceCode(new_code))
return FuzzResult(seed, None)
def run_fuzzer_concurrently(args: ResolvedCliArgs) -> list[FuzzResult]:
print(
f"Concurrently running the fuzzer on "
f"{len(args.seeds)} randomly generated source-code files..."
)
bugs: list[FuzzResult] = []
with concurrent.futures.ProcessPoolExecutor() as executor:
fuzz_result_futures = [
executor.submit(fuzz_code, seed, args.only_new_bugs) for seed in args.seeds
]
try:
for future in concurrent.futures.as_completed(fuzz_result_futures):
fuzz_result = future.result()
if not args.quiet:
fuzz_result.print_description()
if fuzz_result.maybe_bug:
bugs.append(fuzz_result)
except KeyboardInterrupt:
print("\nShutting down the ProcessPoolExecutor due to KeyboardInterrupt...")
print("(This might take a few seconds)")
executor.shutdown(cancel_futures=True)
raise
return bugs
def run_fuzzer_sequentially(args: ResolvedCliArgs) -> list[FuzzResult]:
print(
f"Sequentially running the fuzzer on "
f"{len(args.seeds)} randomly generated source-code files..."
)
bugs: list[FuzzResult] = []
for seed in args.seeds:
fuzz_result = fuzz_code(seed, only_new_bugs=args.only_new_bugs)
if not args.quiet:
fuzz_result.print_description()
if fuzz_result.maybe_bug:
bugs.append(fuzz_result)
return bugs
def main(args: ResolvedCliArgs) -> None:
if args.only_new_bugs:
ruff_version = (
subprocess.run(
["ruff", "--version"], text=True, capture_output=True, check=True
)
.stdout.strip()
.split(" ")[1]
)
print(
f"As you have selected `--only-new-bugs`, "
f"bugs will only be reported if they appear on your current branch "
f"but do *not* appear in `ruff=={ruff_version}`"
)
if len(args.seeds) <= 5:
bugs = run_fuzzer_sequentially(args)
else:
bugs = run_fuzzer_concurrently(args)
noun_phrase = "New bugs" if args.only_new_bugs else "Bugs"
if bugs:
print(colored(f"{noun_phrase} found in the following seeds:", "red"))
print(*sorted(bug.seed for bug in bugs))
else:
print(colored(f"No {noun_phrase.lower()} found!", "green"))
def parse_seed_argument(arg: str) -> int | range:
"""Helper for argument parsing"""
if "-" in arg:
start, end = map(int, arg.split("-"))
if end <= start:
raise argparse.ArgumentTypeError(
f"Error when parsing seed argument {arg!r}: "
f"range end must be > range start"
)
seed_range = range(start, end + 1)
range_too_long = (
f"Error when parsing seed argument {arg!r}: "
f"maximum allowed range length is 1_000_000_000"
)
try:
if len(seed_range) > 1_000_000_000:
raise argparse.ArgumentTypeError(range_too_long)
except OverflowError:
raise argparse.ArgumentTypeError(range_too_long) from None
return range(int(start), int(end) + 1)
return int(arg)
@dataclass(slots=True)
class ResolvedCliArgs:
seeds: list[Seed]
_: KW_ONLY
only_new_bugs: bool
quiet: bool
def parse_args() -> ResolvedCliArgs:
"""Parse command-line arguments"""
parser = argparse.ArgumentParser(
description=__doc__, formatter_class=argparse.RawTextHelpFormatter
)
parser.add_argument(
"seeds",
type=parse_seed_argument,
nargs="+",
help="Either a single seed, or an inclusive range of seeds in the format `0-5`",
)
parser.add_argument(
"--only-new-bugs",
action="store_true",
help=(
"Only report bugs if they exist on the current branch, "
"but *didn't* exist on the released version of Ruff "
"installed into the Python environment we're running in"
),
)
parser.add_argument(
"--quiet",
action="store_true",
help="Print fewer things to the terminal while running the fuzzer",
)
args = parser.parse_args()
seed_arguments: list[range | int] = args.seeds
seen_seeds: set[int] = set()
for arg in seed_arguments:
if isinstance(arg, int):
seen_seeds.add(arg)
else:
seen_seeds.update(arg)
return ResolvedCliArgs(
sorted(map(Seed, seen_seeds)),
only_new_bugs=args.only_new_bugs,
quiet=args.quiet,
)
if __name__ == "__main__":
args = parse_args()
main(args)

View File

@@ -0,0 +1,4 @@
pysource-codegen
pysource-minimize
ruff
termcolor

View File

@@ -0,0 +1,28 @@
# This file was autogenerated by uv via the following command:
# uv pip compile scripts/fuzz-parser/requirements.in --output-file scripts/fuzz-parser/requirements.txt
asttokens==2.4.1
# via pysource-minimize
astunparse==1.6.3
# via pysource-minimize
click==8.1.7
# via pysource-minimize
markdown-it-py==3.0.0
# via rich
mdurl==0.1.2
# via markdown-it-py
pygments==2.17.2
# via rich
pysource-codegen==0.5.1
pysource-minimize==0.6.2
rich==13.7.1
# via pysource-minimize
ruff==0.4.0
six==1.16.0
# via
# asttokens
# astunparse
termcolor==2.4.0
typing-extensions==4.11.0
# via pysource-codegen
wheel==0.43.0
# via astunparse

View File

@@ -11,17 +11,19 @@ line-length = 88
line-length = 88
[tool.ruff.lint]
select = ["ALL"]
ignore = [
"C901", # McCabe complexity
"D", # pydocstyle
"PL", # pylint
"S", # bandit
"G", # flake8-logging
"T", # flake8-print
"FBT", # flake8-boolean-trap
"PERF", # perflint
"ANN401",
select = [
"E", # pycodestyle (error)
"F", # pyflakes
"B", # bugbear
"B9",
"C4", # flake8-comprehensions
"SIM", # flake8-simplify
"I", # isort
"UP", # pyupgrade
"PIE", # flake8-pie
"PGH", # pygrep-hooks
"PYI", # flake8-pyi
"RUF",
]
[tool.ruff.lint.isort]