This PR adjusts the logic for skipping formatting so that a `fmt: skip`
can affect multiple statements if they lie on the same line.
Specifically, a `fmt: skip` comment will now suppress all the statements
in the suite in which it appears whose range intersects the line
containing the skip directive. For example:
```python
x=[
'1'
];x=2 # fmt: skip
```
remains unchanged after formatting.
(Note that compound statements are somewhat special and were handled in
a previous PR - see #20633).
Closes#17331 and #11430.
Simplest to review commit by commit - the key diffs of interest are the
commit introducing the core logic, and the diff between the snapshots
introduced in the last commit (compared to the second commit).
# Implementation
On `main` we format a suite of statements by iterating through them. If
we meet a statement with a leading or trailing (own-line)`fmt: off`
comment, then we suppress formatting until we meet a `fmt: on` comment.
Otherwise we format the statement using its own formatting rule.
How are `fmt: skip` comments handled then? They are handled internally
to the formatting of each statement. Specifically, calling `.fmt` on a
statement node will first check to see if there is a trailing,
end-of-line `fmt: skip` (or `fmt: off`/`yapf: off`), and if so then
write the node with suppressed formatting.
In this PR we move the responsibility for handling `fmt: skip` into the
formatting logic of the suite itself. This is done as follows:
- Before beginning to format the suite, we do a pass through the
statements and collect the data of ranges with skipped formatting. More
specifically, we create a map with key given by the _first_ skipped
statement in a block and value a pair consisting of the _last_ skipped
statement and the _range_ to write verbatim.
- We iterate as before, but if we meet a statement that is a key in the
map constructed above, we pause to write the associated range verbatim.
We then advance the iterator to the last statement in the block and
proceed as before.
## Addendum on range formatting
We also had to make some changes to range formatting in order to support
this new behavior. For example, we want to make sure that
```python
<RANGE_START>x=1<RANGE_END>;x=2 # fmt: skip
```
formats verbatim, rather than becoming
```python
x = 1;x=2 # fmt: skip
```
Recall that range formatting proceeds in two steps:
1. Find the smallest enclosing node containing the range AND that has
enough info to format the range (so it may be larger than you think,
e.g. a docstring has enclosing node given by the suite, not the string
itself.)
2. Carve out the formatted range from the result of formatting that
enclosing node.
We had to modify (1), since the suite knows how to format skipped nodes,
but nodes may not "know" they are skipped. To do this we altered the
`visit_body` bethod of the `FindEnclosingNode` visitor: now we iterate
through the statements and check for skipped ranges intersecting the
format range. If we find them, we return without descending. The result
is to consider the statement containing the suite as the enclosing node
in this case.
While looking into potential AST optimizations, I noticed the `AstNode`
trait and `AnyNode` type aren't used anywhere in Ruff or Red Knot. It
looks like they might be historical artifacts of previous ways of
consuming AST nodes?
- `AstNode::cast`, `AstNode::cast_ref`, and `AstNode::can_cast` are not
used anywhere.
- Since `cast_ref` isn't needed anymore, the `Ref` associated type isn't
either.
This is a pure refactoring, with no intended behavior changes.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
Fixes#6611
## Summary
This lint rule spots comments that are _intended_ to suppress or enable
the formatter, but will be ignored by the Ruff formatter.
We borrow some functions the formatter uses for determining comment
placement / putting them in context within an AST.
The analysis function uses an AST visitor to visit each comment and
attach it to the AST. It then uses that context to check:
1. Is this comment in an expression?
2. Does this comment have bad placement? (e.g. a `# fmt: skip` above a
function instead of at the end of a line)
3. Is this comment redundant?
4. Does this comment actually suppress any code?
5. Does this comment have ambiguous placement? (e.g. a `# fmt: off`
above an `else:` block)
If any of these are true, a violation is thrown. The reported reason
depends on the order of the above check-list: in other words, a `# fmt:
skip` comment on its own line within a list expression will be reported
as being in an expression, since that reason takes priority.
The lint suggests removing the comment as an unsafe fix, regardless of
the reason.
## Test Plan
A snapshot test has been created.
## Summary
Builds on #6170 to break `global` and `nonlocal` statements, such that
we get:
```python
def f():
global \
analyze_featuremap_layer, \
analyze_featuremapcompression_layer, \
analyze_latencies_post, \
analyze_motions_layer, \
analyze_size_model
```
Instead of:
```python
def f():
global analyze_featuremap_layer, analyze_featuremapcompression_layer, analyze_latencies_post, analyze_motions_layer, analyze_size_model
```
Notably, we avoid applying this formatting if the statement ends in a
comment. Otherwise, the comment would _need_ to be placed after the last
item, like:
```python
def f():
global \
analyze_featuremap_layer, \
analyze_featuremapcompression_layer, \
analyze_latencies_post, \
analyze_motions_layer, \
analyze_size_model # noqa
```
To me, this seems wrong (and would break the `# noqa` comment). Ideally,
the items would be parenthesized, and the comment would be on the inner
parenthesis, like:
```python
def f():
global ( # noqa
analyze_featuremap_layer,
analyze_featuremapcompression_layer,
analyze_latencies_post,
analyze_motions_layer,
analyze_size_model
)
```
But that's not valid syntax.
## Summary
Adds `global` and `nonlocal` formatting, without the "deviation from
black" outlined in the linked issue, which I'll do separately.
See: https://github.com/astral-sh/ruff/issues/4798.
## Test Plan
Added a fixture in the Ruff-specific directory since the Black fixtures
don't seem to cover this.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR replaces the `verbatim_text` builder with a `not_yet_implemented` builder that emits `NOT_YET_IMPLEMENTED_<NodeKind>` for not yet implemented nodes.
The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:
```python
def func_no_args():
a; b; c
if True: raise RuntimeError
if False: ...
for i in range(10):
print(i)
continue
```
Get's reformatted to
```python
def func_no_args():
a; b; c
if True: raise RuntimeError
if False: ...
for i in range(10):
print(i)
continue
```
because our formatter does not yet support `for` statements and just inserts the text from the source.
## Downsides
Using an identifier will not work in all situations. For example, an identifier is invalid in an `Arguments ` position. That's why I kept `verbatim_text` around and e.g. use it in the `Arguments` formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, `verbatim_text` we can opt in to `verbatim_text` when we want to iterate quickly on nodes that we don't want to provide a full implementation yet and using an identifier would be invalid.
## Upsides
Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)
## Test Plan
None?