Compare commits

...

39 Commits

Author SHA1 Message Date
Dhruv Manilawala
b84feeb168 Remove inner abstraction layer from string nodes 2024-02-18 17:35:03 +05:30
Charlie Marsh
235cfb7976 Bump version to v0.2.2 (#10018) 2024-02-17 22:15:04 +00:00
Dhruv Manilawala
91ae81b565 Move RUF001, RUF002 to AST checker (#9993)
## Summary

Part of #7595 

This PR moves the `RUF001` and `RUF002` rules to the AST checker. This
removes the use of docstring detection from these rules.

## Test Plan

As this is just a refactor, make sure existing test cases pass.
2024-02-17 17:01:31 +00:00
Adam Kuhn
d46c5d8ac8 docs: Formatter compatibility warning for D207 and D300 (#10007)
- Update docs to mention formatter compatibility interactions for
under-indentation (D207) and triple-single-quotes (D300)
- Changes verified locally with mkdocs
- Closes: https://github.com/astral-sh/ruff/issues/9675
2024-02-17 07:37:38 -05:00
Jane Lewis
20217e9bbd Fix panic on RUF027 (#9990)
## Summary

Fixes #9895 

The cause for this panic came from an offset error in the code. When
analyzing a hypothetical f-string, we attempt to re-parse it as an
f-string, and use the AST data to determine, among other things, whether
the format specifiers are correct. To determine the 'correctness' of a
format specifier, we actually have to re-parse the format specifier, and
this is where the issue lies. To get the source text for the specifier,
we were taking a slice from the original file source text... even though
the AST data for the specifier belongs to the standalone parsed f-string
expression, meaning that the ranges are going to be way off. In a file
with Unicode, this can cause panics if the slice is inside a char
boundary.

To fix this, we now slice from the temporary source we created earlier
to parse the literal as an f-string.

## Test Plan

The RUF027 snapshot test was amended to include a string with format
specifiers which we _should_ be calling out. This is to ensure we do
slice format specifiers from the source text correctly.
2024-02-16 20:04:39 +00:00
Dhruv Manilawala
72bf1c2880 Preview minimal f-string formatting (#9642)
## Summary

_This is preview only feature and is available using the `--preview`
command-line flag._

With the implementation of [PEP 701] in Python 3.12, f-strings can now
be broken into multiple lines, can contain comments, and can re-use the
same quote character. Currently, no other Python formatter formats the
f-strings so there's some discussion which needs to happen in defining
the style used for f-string formatting. Relevant discussion:
https://github.com/astral-sh/ruff/discussions/9785

The goal for this PR is to add minimal support for f-string formatting.
This would be to format expression within the replacement field without
introducing any major style changes.

### Newlines

The heuristics for adding newline is similar to that of
[Prettier](https://prettier.io/docs/en/next/rationale.html#template-literals)
where the formatter would only split an expression in the replacement
field across multiple lines if there was already a line break within the
replacement field.

In other words, the formatter would not add any newlines unless they
were already present i.e., they were added by the user. This makes
breaking any expression inside an f-string optional and in control of
the user. For example,

```python
# We wouldn't break this
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"

# But, we would break the following as there's already a newline
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
	aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
```


If there are comments in any of the replacement field of the f-string,
then it will always be a multi-line f-string in which case the formatter
would prefer to break expressions i.e., introduce newlines. For example,

```python
x = f"{ # comment
    a }"
```

### Quotes

The logic for formatting quotes remains unchanged. The existing logic is
used to determine the necessary quote char and is used accordingly.

Now, if the expression inside an f-string is itself a string like, then
we need to make sure to preserve the existing quote and not change it to
the preferred quote unless it's 3.12. For example,

```python
f"outer {'inner'} outer"

# For pre 3.12, preserve the single quote
f"outer {'inner'} outer"

# While for 3.12 and later, the quotes can be changed
f"outer {"inner"} outer"
```

But, for triple-quoted strings, we can re-use the same quote char unless
the inner string is itself a triple-quoted string.

```python
f"""outer {"inner"} outer"""  # valid
f"""outer {'''inner'''} outer"""  # preserve the single quote char for the inner string
```

### Debug expressions

If debug expressions are present in the replacement field of a f-string,
then the whitespace needs to be preserved as they will be rendered as it
is (for example, `f"{ x = }"`. If there are any nested f-strings, then
the whitespace in them needs to be preserved as well which means that
we'll stop formatting the f-string as soon as we encounter a debug
expression.

```python
f"outer {   x =  !s  :.3f}"
#                  ^^
#                  We can remove these whitespaces
```

Now, the whitespace doesn't need to be preserved around conversion spec
and format specifiers, so we'll format them as usual but we won't be
formatting any nested f-string within the format specifier.

### Miscellaneous

- The
[`hug_parens_with_braces_and_square_brackets`](https://github.com/astral-sh/ruff/issues/8279)
preview style isn't implemented w.r.t. the f-string curly braces.
- The
[indentation](https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590)
is always relative to the f-string containing statement

## Test Plan

* Add new test cases
* Review existing snapshot changes
* Review the ecosystem changes

[PEP 701]: https://peps.python.org/pep-0701/
2024-02-16 20:28:11 +05:30
Jacob Coffee
c47ff658e4 chore(docs): update Discord invite to permalink (#10005)
## Summary

Update the Discord unique-id invite links to use the company permalink.

## Test Plan

Visiting the links
2024-02-15 23:16:02 -05:00
Adrien Ball
c3bba54b6b Fix SIM113 false positive with async for loops (#9996)
## Summary
Ignore `async for` loops when checking the SIM113 rule.

Closes #9995 

## Test Plan
A new test case was added to SIM113.py with an async for loop.
2024-02-15 22:40:01 -05:00
Micha Reiser
fe79798c12 split string module (#9987) 2024-02-14 18:54:55 +01:00
Micha Reiser
bb8d2034e2 Use atomic write when persisting cache (#9981) 2024-02-14 15:09:21 +01:00
Charlie Marsh
f40e012b4e Use name directly in RUF006 (#9979) 2024-02-14 00:00:47 +00:00
Asger Hautop Drewsen
3e9d761b13 Expand asyncio-dangling-task (RUF006) to include new_event_loop (#9976)
## Summary

Fixes #9974

## Test Plan

I added some new test cases.
2024-02-13 18:28:06 +00:00
Micha Reiser
46db3f96ac Add example demonstrating that fmt: skip on expression level is not supported (#9973) 2024-02-13 15:35:27 +00:00
Dhruv Manilawala
6f9c128d77 Separate StringNormalizer from StringPart (#9954)
## Summary

This PR is a small refactor to extract out the logic for normalizing
string in the formatter from the `StringPart` struct. It also separates
the quote selection into a separate method on the new
`StringNormalizer`. Both of these will help in the f-string formatting
to use `StringPart` and `choose_quotes` irrespective of normalization.

The reason for having separate quote selection and normalization step is
so that the f-string formatting can perform quote selection on its own.

Unlike string and byte literals, the f-string formatting would require
that the normalization happens only for the literal elements of it i.e.,
the "foo" and "bar" in `f"foo {x + y} bar"`. This will automatically be
handled by the already separate `normalize_string` function.

Another use-case in the f-string formatting is to extract out the
relevant information from the `StringPart` like quotes and prefix which
is to be passed as context while formatting each element of an f-string.

## Test Plan

Ensure that clippy is happy and all tests pass.
2024-02-13 18:14:56 +05:30
Micha Reiser
6380c90031 Run isort CRLF tests (#9970) 2024-02-13 09:25:22 +01:00
Charlie Marsh
d96a0dbe57 Respect tuple assignments in typing analyzer (#9969)
## Summary

Just addressing some discrepancies between the analyzers like `is_dict`
and the logic that's matured in `find_binding_value`.
2024-02-13 05:02:52 +00:00
Dhruv Manilawala
180920fdd9 Make semantic model aware of docstring (#9960)
## Summary

This PR introduces a new semantic model flag `DOCSTRING` which suggests
that the model is currently in a module / class / function docstring.
This is the first step in eliminating the docstring detection state
machine which is prone to bugs as stated in #7595.

## Test Plan

~TODO: Is there a way to add a test case for this?~

I tested this using the following code snippet and adding a print
statement in the `string_like` analyzer to print if we're currently in a
docstring or not.

<details><summary>Test code snippet:</summary>
<p>

```python
"Docstring" ", still a docstring"
"Not a docstring"


def foo():
    "Docstring"
    "Not a docstring"
    if foo:
        "Not a docstring"
        pass


class Foo:
    "Docstring"
    "Not a docstring"

    foo: int
    "Unofficial variable docstring"

    def method():
        "Docstring"
        "Not a docstring"
        pass


def bar():
    "Not a docstring".strip()


def baz():
    _something_else = 1
    """Not a docstring"""
```

</p>
</details>
2024-02-13 04:26:08 +00:00
konsti
1ccd8354c1 Don't forget to set your cpu to performance mode (#9700)
Since i just spent quite some time wondering why my benchmarks were the
opposite of what they should be, a reminder to check your cpu governor.
Setting mine to perf mode was crucial.
2024-02-13 03:36:11 +00:00
Aleksei Latyshev
dd0ba16a79 [refurb] Implement readlines_in_for lint (FURB129) (#9880)
## Summary
Implement [implicit readlines
(FURB129)](https://github.com/dosisod/refurb/blob/master/refurb/checks/iterable/implicit_readlines.py)
lint.

## Notes
I need a help/an opinion about suggested implementations.

This implementation differs from the original one from `refurb` in the
following way. This implementation checks syntactically the call of the
method with the name `readlines()` inside `for` {loop|generator
expression}. The implementation from refurb also
[checks](https://github.com/dosisod/refurb/blob/master/refurb/checks/iterable/implicit_readlines.py#L43)
that callee is a variable with a type `io.TextIOWrapper` or
`io.BufferedReader`.

- I do not see a simple way to implement the same logic.
- The best I can have is something like
```rust
checker.semantic().binding(checker.semantic().resolve_name(attr_expr.value.as_name_expr()?)?).statement(checker.semantic())
```
and analyze cases. But this will be not about types, but about guessing
the type by assignment (or with) expression.
- Also this logic has several false negatives, when the callee is not a
variable, but the result of function call (e.g. `open(...)`).
- On the other side, maybe it is good to lint this on other things,
where this suggestion is not safe, and push the developers to change
their interfaces to be less surprising, comparing with the standard
library.
- Anyway while the current implementation has false-positives (I
mentioned some of them in the test) I marked the fixes to be unsafe.
2024-02-12 22:28:35 -05:00
Charlie Marsh
609d0a9a65 Remove symbol from type-matching API (#9968)
## Summary

These should be no-op refactors to remove some redundant data from the
type analysis APIs.
2024-02-12 20:57:19 -05:00
Auguste Lalande
8fba97f72f PLR2004: Accept 0.0 and 1.0 as common magic values (#9964)
## Summary

Accept 0.0 and 1.0 as common magic values. This is in line with the
pylint behaviour, and I think makes sense conceptually.


## Test Plan

Test cases were added to
`crates/ruff_linter/resources/test/fixtures/pylint/magic_value_comparison.py`
2024-02-13 01:21:06 +00:00
Charlie Marsh
5bc0d9c324 Add a binding kind for comprehension targets (#9967)
## Summary

I was surprised to learn that we treat `x` in `[_ for x in y]` as an
"assignment" binding kind, rather than a dedicated comprehension
variable.
2024-02-12 20:09:39 -05:00
Hashem
cf77eeb913 unused_imports/F401: Explain when imports are preserved (#9963)
The docs previously mentioned an irrelevant config option, but were
missing a link to the relevant `ignore-init-module-imports` config
option which _is_ actually used.

Additionally, this commit adds a link to the documentation to explain
the conventions around a module interface which includes using a
redundant import alias to preserve an unused import.

(noticed this while filing  #9962)
2024-02-12 19:07:20 -05:00
Dhruv Manilawala
3f4dd01e7a Rename semantic model flag to MODULE_DOCSTRING_BOUNDARY (#9959)
## Summary

This PR renames the semantic model flag `MODULE_DOCSTRING` to
`MODULE_DOCSTRING_BOUNDARY`. The main reason is for readability and for
the new semantic model flag `DOCSTRING` which tracks that the model is
in a module / class / function docstring.

I got confused earlier with the name until I looked at the use case and
it seems that the `_BOUNDARY` prefix is more appropriate for the
use-case and is consistent with other flags.
2024-02-13 00:47:12 +05:30
Micha Reiser
edfe8421ec Disable top-level docstring formatting for notebooks (#9957) 2024-02-12 18:14:02 +00:00
Charlie Marsh
ab2253db03 [pylint] Avoid suggesting set rewrites for non-hashable types (#9956)
## Summary

Ensures that `x in [y, z]` does not trigger in `x`, `y`, or `z` are
known _not_ to be hashable.

Closes https://github.com/astral-sh/ruff/issues/9928.
2024-02-12 13:05:54 -05:00
Dhruv Manilawala
33ac2867b7 Use non-parenthesized range for DebugText (#9953)
## Summary

This PR fixes the `DebugText` implementation to use the expression range
instead of the parenthesized range.

Taking the following code snippet as an example:
```python
x = 1
print(f"{  ( x  ) = }")
```

The output of running it would be:
```
  ( x  ) = 1
```

Notice that the whitespace between the parentheses and the expression is
preserved as is.

Currently, we don't preserve this information in the AST which defeats
the purpose of `DebugText` as the main purpose of the struct is to
preserve whitespaces _around_ the expression.

This is also problematic when generating the code from the AST node as
then the generator has no information about the parentheses the
whitespaces between them and the expression which would lead to the
removal of the parentheses in the generated code.

I noticed this while working on the f-string formatting where the debug
text would be used to preserve the text surrounding the expression in
the presence of debug expression. The parentheses were being dropped
then which made me realize that the problem is instead in the parser.

## Test Plan

1. Add a test case for the parser
2. Add a test case for the generator
2024-02-12 23:00:02 +05:30
Charlie Marsh
0304623878 [perflint] Catch a wider range of mutations in PERF101 (#9955)
## Summary

This PR ensures that if a list `x` is modified within a `for` loop, we
avoid flagging `list(x)` as unnecessary. Previously, we only detected
calls to exactly `.append`, and they couldn't be nested within other
statements.

Closes https://github.com/astral-sh/ruff/issues/9925.
2024-02-12 12:17:55 -05:00
Charlie Marsh
e2785f3fb6 [flake8-pyi] Ignore 'unused' private type dicts in class scopes (#9952)
## Summary

If these are defined within class scopes, they're actually attributes of
the class, and can be accessed through the class itself.

(We preserve our existing behavior for `.pyi` files.)

Closes https://github.com/astral-sh/ruff/issues/9948.
2024-02-12 17:06:20 +00:00
dependabot[bot]
90f8e4baf4 Bump the actions group with 1 update (#9943) 2024-02-12 12:05:31 -05:00
Micha Reiser
8657a392ff Docstring formatting: Preserve tab indentation when using indent-style=tabs (#9915) 2024-02-12 16:09:13 +01:00
Micha Reiser
4946a1876f Stabilize quote-style preserve (#9922) 2024-02-12 09:30:07 +00:00
dependabot[bot]
6dc1b21917 Bump indicatif from 0.17.7 to 0.17.8 (#9942)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-12 10:25:47 +01:00
dependabot[bot]
2e1160e74c Bump thiserror from 1.0.56 to 1.0.57 (#9941)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-12 10:24:40 +01:00
dependabot[bot]
37ff436e4e Bump chrono from 0.4.33 to 0.4.34 (#9940)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-12 10:24:16 +01:00
Micha Reiser
341c2698a7 Run doctests as part of CI pipeline (#9939) 2024-02-12 10:18:58 +01:00
Owen Lamont
a50e2787df Fixed nextest install line in CONTRIBUTING.md (#9929)
## Summary

I noticed the example line in CONTRIBUTING.md:

```shell
cargo install nextest
```

Didn't appear to install the intended package cargo-nextest.


![nextest](https://github.com/astral-sh/ruff/assets/12672027/7bbdd9c3-c35a-464a-b586-3e9f777f8373)

So I checked what it [should
be](https://nexte.st/book/installing-from-source.html) and replaced the
line:

```shell
cargo install cargo-nextest --locked
```

## Test Plan

Just checked the cargo install appeared to give sane looking results

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-02-11 15:22:17 +00:00
wzy
25868d0371 docs: add mdformat-ruff to integrations.md (#9924)
Can [mdformat-ruff](https://github.com/Freed-Wu/mdformat-ruff) be hosted
in <https://github.com/astral-sh> like other integrations of ruff? TIA!
2024-02-11 03:39:15 +00:00
Charlie Marsh
af2cba7c0a Migrate to nextest (#9921)
## Summary

We've had success with `nextest` in other projects, so lets migrate
Ruff.

The Linux tests look a little bit faster (from 2m32s down to 2m8s), the
Windows tests look a little bit slower but not dramatically so.
2024-02-10 18:58:56 +00:00
126 changed files with 5362 additions and 1395 deletions

8
.config/nextest.toml Normal file
View File

@@ -0,0 +1,8 @@
[profile.ci]
# Print out output for failing tests as soon as they fail, and also at the end
# of the run (for easy scrollability).
failure-output = "immediate-final"
# Do not cancel the test run on the first failure.
fail-fast = false
status-level = "skip"

View File

@@ -111,13 +111,23 @@ jobs:
- uses: actions/checkout@v4
- name: "Install Rust toolchain"
run: rustup show
- name: "Install mold"
uses: rui314/setup-mold@v1
- name: "Install cargo nextest"
uses: taiki-e/install-action@v2
with:
tool: cargo-nextest
- name: "Install cargo insta"
uses: taiki-e/install-action@v2
with:
tool: cargo-insta
- uses: Swatinem/rust-cache@v2
- name: "Run tests"
run: cargo insta test --all --all-features --unreferenced reject
shell: bash
env:
NEXTEST_PROFILE: "ci"
run: cargo insta test --all-features --unreferenced reject --test-runner nextest
# Check for broken links in the documentation.
- run: cargo doc --all --no-deps
env:
@@ -138,15 +148,16 @@ jobs:
- uses: actions/checkout@v4
- name: "Install Rust toolchain"
run: rustup show
- name: "Install cargo insta"
- name: "Install cargo nextest"
uses: taiki-e/install-action@v2
with:
tool: cargo-insta
tool: cargo-nextest
- uses: Swatinem/rust-cache@v2
- name: "Run tests"
shell: bash
# We can't reject unreferenced snapshots on windows because flake8_executable can't run on windows
run: cargo insta test --all --exclude ruff_dev --all-features
run: |
cargo nextest run --all-features --profile ci
cargo test --all-features --doc
cargo-test-wasm:
name: "cargo test (wasm)"
@@ -407,7 +418,7 @@ jobs:
- uses: actions/setup-python@v5
- name: "Add SSH key"
if: ${{ env.MKDOCS_INSIDERS_SSH_KEY_EXISTS == 'true' }}
uses: webfactory/ssh-agent@v0.8.0
uses: webfactory/ssh-agent@v0.9.0
with:
ssh-private-key: ${{ secrets.MKDOCS_INSIDERS_SSH_KEY }}
- name: "Install Rust toolchain"

View File

@@ -23,7 +23,7 @@ jobs:
- uses: actions/setup-python@v5
- name: "Add SSH key"
if: ${{ env.MKDOCS_INSIDERS_SSH_KEY_EXISTS == 'true' }}
uses: webfactory/ssh-agent@v0.8.0
uses: webfactory/ssh-agent@v0.9.0
with:
ssh-private-key: ${{ secrets.MKDOCS_INSIDERS_SSH_KEY }}
- name: "Install Rust toolchain"

View File

@@ -1,5 +1,66 @@
# Changelog
## 0.2.2
Highlights include:
- Initial support formatting f-strings (in `--preview`).
- Support for overriding arbitrary configuration options via the CLI through an expanded `--config`
argument (e.g., `--config "lint.isort.combine-as-imports=false"`).
- Significant performance improvements in Ruff's lexer, parser, and lint rules.
### Preview features
- Implement minimal f-string formatting ([#9642](https://github.com/astral-sh/ruff/pull/9642))
- \[`pycodestyle`\] Add blank line(s) rules (`E301`, `E302`, `E303`, `E304`, `E305`, `E306`) ([#9266](https://github.com/astral-sh/ruff/pull/9266))
- \[`refurb`\] Implement `readlines_in_for` (`FURB129`) ([#9880](https://github.com/astral-sh/ruff/pull/9880))
### Rule changes
- \[`ruff`\] Ensure closing parentheses for multiline sequences are always on their own line (`RUF022`, `RUF023`) ([#9793](https://github.com/astral-sh/ruff/pull/9793))
- \[`numpy`\] Add missing deprecation violations (`NPY002`) ([#9862](https://github.com/astral-sh/ruff/pull/9862))
- \[`flake8-bandit`\] Detect `mark_safe` usages in decorators ([#9887](https://github.com/astral-sh/ruff/pull/9887))
- \[`ruff`\] Expand `asyncio-dangling-task` (`RUF006`) to include `new_event_loop` ([#9976](https://github.com/astral-sh/ruff/pull/9976))
- \[`flake8-pyi`\] Ignore 'unused' private type dicts in class scopes ([#9952](https://github.com/astral-sh/ruff/pull/9952))
### Formatter
- Docstring formatting: Preserve tab indentation when using `indent-style=tabs` ([#9915](https://github.com/astral-sh/ruff/pull/9915))
- Disable top-level docstring formatting for notebooks ([#9957](https://github.com/astral-sh/ruff/pull/9957))
- Stabilize quote-style's `preserve` mode ([#9922](https://github.com/astral-sh/ruff/pull/9922))
### CLI
- Allow arbitrary configuration options to be overridden via the CLI ([#9599](https://github.com/astral-sh/ruff/pull/9599))
### Bug fixes
- Make `show-settings` filters directory-agnostic ([#9866](https://github.com/astral-sh/ruff/pull/9866))
- Respect duplicates when rewriting type aliases ([#9905](https://github.com/astral-sh/ruff/pull/9905))
- Respect tuple assignments in typing analyzer ([#9969](https://github.com/astral-sh/ruff/pull/9969))
- Use atomic write when persisting cache ([#9981](https://github.com/astral-sh/ruff/pull/9981))
- Use non-parenthesized range for `DebugText` ([#9953](https://github.com/astral-sh/ruff/pull/9953))
- \[`flake8-simplify`\] Avoid false positive with `async` for loops (`SIM113`) ([#9996](https://github.com/astral-sh/ruff/pull/9996))
- \[`flake8-trio`\] Respect `async with` in `timeout-without-await` ([#9859](https://github.com/astral-sh/ruff/pull/9859))
- \[`perflint`\] Catch a wider range of mutations in `PERF101` ([#9955](https://github.com/astral-sh/ruff/pull/9955))
- \[`pycodestyle`\] Fix `E30X` panics on blank lines with trailing white spaces ([#9907](https://github.com/astral-sh/ruff/pull/9907))
- \[`pydocstyle`\] Allow using `parameters` as a subsection header (`D405`) ([#9894](https://github.com/astral-sh/ruff/pull/9894))
- \[`pydocstyle`\] Fix blank-line docstring rules for module-level docstrings ([#9878](https://github.com/astral-sh/ruff/pull/9878))
- \[`pylint`\] Accept 0.0 and 1.0 as common magic values (`PLR2004`) ([#9964](https://github.com/astral-sh/ruff/pull/9964))
- \[`pylint`\] Avoid suggesting set rewrites for non-hashable types ([#9956](https://github.com/astral-sh/ruff/pull/9956))
- \[`ruff`\] Avoid false negatives with string literals inside of method calls (`RUF027`) ([#9865](https://github.com/astral-sh/ruff/pull/9865))
- \[`ruff`\] Fix panic on with f-string detection (`RUF027`) ([#9990](https://github.com/astral-sh/ruff/pull/9990))
- \[`ruff`\] Ignore builtins when detecting missing f-strings ([#9849](https://github.com/astral-sh/ruff/pull/9849))
### Performance
- Use `memchr` for string lexing ([#9888](https://github.com/astral-sh/ruff/pull/9888))
- Use `memchr` for tab-indentation detection ([#9853](https://github.com/astral-sh/ruff/pull/9853))
- Reduce `Result<Tok, LexicalError>` size by using `Box<str>` instead of `String` ([#9885](https://github.com/astral-sh/ruff/pull/9885))
- Reduce size of `Expr` from 80 to 64 bytes ([#9900](https://github.com/astral-sh/ruff/pull/9900))
- Improve trailing comma rule performance ([#9867](https://github.com/astral-sh/ruff/pull/9867))
- Remove unnecessary string cloning from the parser ([#9884](https://github.com/astral-sh/ruff/pull/9884))
## 0.2.1
This release includes support for range formatting (i.e., the ability to format specific lines

View File

@@ -26,6 +26,10 @@ Welcome! We're happy to have you here. Thank you in advance for your contributio
- [`cargo dev`](#cargo-dev)
- [Subsystems](#subsystems)
- [Compilation Pipeline](#compilation-pipeline)
- [Import Categorization](#import-categorization)
- [Project root](#project-root)
- [Package root](#package-root)
- [Import categorization](#import-categorization-1)
## The Basics
@@ -35,7 +39,7 @@ For small changes (e.g., bug fixes), feel free to submit a PR.
For larger changes (e.g., new lint rules, new functionality, new configuration options), consider
creating an [**issue**](https://github.com/astral-sh/ruff/issues) outlining your proposed change.
You can also join us on [**Discord**](https://discord.gg/c9MhzV8aU5) to discuss your idea with the
You can also join us on [**Discord**](https://discord.com/invite/astral-sh) to discuss your idea with the
community. We've labeled [beginner-friendly tasks](https://github.com/astral-sh/ruff/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
in the issue tracker, along with [bugs](https://github.com/astral-sh/ruff/issues?q=is%3Aissue+is%3Aopen+label%3Abug)
and [improvements](https://github.com/astral-sh/ruff/issues?q=is%3Aissue+is%3Aopen+label%3Aaccepted)
@@ -63,7 +67,7 @@ You'll also need [Insta](https://insta.rs/docs/) to update snapshot tests:
cargo install cargo-insta
```
and pre-commit to run some validation checks:
And you'll need pre-commit to run some validation checks:
```shell
pipx install pre-commit # or `pip install pre-commit` if you have a virtualenv
@@ -76,6 +80,16 @@ when making a commit:
pre-commit install
```
We recommend [nextest](https://nexte.st/) to run Ruff's test suite (via `cargo nextest run`),
though it's not strictly necessary:
```shell
cargo install cargo-nextest --locked
```
Throughout this guide, any usages of `cargo test` can be replaced with `cargo nextest run`,
if you choose to install `nextest`.
### Development
After cloning the repository, run Ruff locally from the repository root with:
@@ -373,6 +387,11 @@ We have several ways of benchmarking and profiling Ruff:
- Microbenchmarks which run the linter or the formatter on individual files. These run on pull requests.
- Profiling the linter on either the microbenchmarks or entire projects
> \[!NOTE\]
> When running benchmarks, ensure that your CPU is otherwise idle (e.g., close any background
> applications, like web browsers). You may also want to switch your CPU to a "performance"
> mode, if it exists, especially when benchmarking short-lived processes.
### CPython Benchmark
First, clone [CPython](https://github.com/python/cpython). It's a large and diverse Python codebase,

22
Cargo.lock generated
View File

@@ -273,9 +273,9 @@ dependencies = [
[[package]]
name = "chrono"
version = "0.4.33"
version = "0.4.34"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9f13690e35a5e4ace198e7beea2895d29f3a9cc55015fcebe6336bd2010af9eb"
checksum = "5bc015644b92d5890fab7489e49d21f879d5c990186827d42ec511919404f38b"
dependencies = [
"android-tzdata",
"iana-time-zone",
@@ -1037,9 +1037,9 @@ dependencies = [
[[package]]
name = "indicatif"
version = "0.17.7"
version = "0.17.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fb28741c9db9a713d93deb3bb9515c20788cef5815265bee4980e87bde7e0f25"
checksum = "763a5a8f45087d6bcea4222e7b72c291a054edf80e4ef6efd2a4979878c7bea3"
dependencies = [
"console",
"instant",
@@ -1979,7 +1979,7 @@ dependencies = [
[[package]]
name = "ruff"
version = "0.2.1"
version = "0.2.2"
dependencies = [
"anyhow",
"argfile",
@@ -2140,7 +2140,7 @@ dependencies = [
[[package]]
name = "ruff_linter"
version = "0.2.1"
version = "0.2.2"
dependencies = [
"aho-corasick",
"annotate-snippets 0.9.2",
@@ -2394,7 +2394,7 @@ dependencies = [
[[package]]
name = "ruff_shrinking"
version = "0.2.1"
version = "0.2.2"
dependencies = [
"anyhow",
"clap",
@@ -2944,18 +2944,18 @@ dependencies = [
[[package]]
name = "thiserror"
version = "1.0.56"
version = "1.0.57"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d54378c645627613241d077a3a79db965db602882668f9136ac42af9ecb730ad"
checksum = "1e45bcbe8ed29775f228095caf2cd67af7a4ccf756ebff23a306bf3e8b47b24b"
dependencies = [
"thiserror-impl",
]
[[package]]
name = "thiserror-impl"
version = "1.0.56"
version = "1.0.57"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fa0faa943b50f3db30a20aa7e265dbc66076993efed8463e8de414e5d06d3471"
checksum = "a953cb265bef375dae3de6663da4d3804eee9682ea80d8e2542529b73c531c81"
dependencies = [
"proc-macro2",
"quote",

View File

@@ -21,7 +21,7 @@ bincode = { version = "1.3.3" }
bitflags = { version = "2.4.1" }
bstr = { version = "1.9.0" }
cachedir = { version = "0.3.1" }
chrono = { version = "0.4.33", default-features = false, features = ["clock"] }
chrono = { version = "0.4.34", default-features = false, features = ["clock"] }
clap = { version = "4.4.18", features = ["derive"] }
clap_complete_command = { version = "0.5.1" }
clearscreen = { version = "2.0.0" }
@@ -44,7 +44,7 @@ hexf-parse = { version ="0.2.1"}
ignore = { version = "0.4.22" }
imara-diff ={ version = "0.1.5"}
imperative = { version = "1.0.4" }
indicatif ={ version = "0.17.7"}
indicatif ={ version = "0.17.8"}
indoc ={ version = "2.0.4"}
insta = { version = "1.34.0", feature = ["filters", "glob"] }
insta-cmd = { version = "0.4.0" }
@@ -92,7 +92,7 @@ strum_macros = { version = "0.25.3" }
syn = { version = "2.0.40" }
tempfile = { version ="3.9.0"}
test-case = { version = "3.3.1" }
thiserror = { version = "1.0.51" }
thiserror = { version = "1.0.57" }
tikv-jemallocator = { version ="0.5.0"}
toml = { version = "0.8.9" }
tracing = { version = "0.1.40" }

View File

@@ -8,7 +8,7 @@
[![image](https://img.shields.io/pypi/pyversions/ruff.svg)](https://pypi.python.org/pypi/ruff)
[![Actions status](https://github.com/astral-sh/ruff/workflows/CI/badge.svg)](https://github.com/astral-sh/ruff/actions)
[**Discord**](https://discord.gg/c9MhzV8aU5) | [**Docs**](https://docs.astral.sh/ruff/) | [**Playground**](https://play.ruff.rs/)
[**Discord**](https://discord.com/invite/astral-sh) | [**Docs**](https://docs.astral.sh/ruff/) | [**Playground**](https://play.ruff.rs/)
An extremely fast Python linter and code formatter, written in Rust.
@@ -150,7 +150,7 @@ Ruff can also be used as a [pre-commit](https://pre-commit.com/) hook via [`ruff
```yaml
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.2.1
rev: v0.2.2
hooks:
# Run the linter.
- id: ruff
@@ -341,14 +341,14 @@ For a complete enumeration of the supported rules, see [_Rules_](https://docs.as
Contributions are welcome and highly appreciated. To get started, check out the
[**contributing guidelines**](https://docs.astral.sh/ruff/contributing/).
You can also join us on [**Discord**](https://discord.gg/c9MhzV8aU5).
You can also join us on [**Discord**](https://discord.com/invite/astral-sh).
## Support
Having trouble? Check out the existing issues on [**GitHub**](https://github.com/astral-sh/ruff/issues),
or feel free to [**open a new one**](https://github.com/astral-sh/ruff/issues/new).
You can also ask for help on [**Discord**](https://discord.gg/c9MhzV8aU5).
You can also ask for help on [**Discord**](https://discord.com/invite/astral-sh).
## Acknowledgements

View File

@@ -1,6 +1,6 @@
[package]
name = "ruff"
version = "0.2.1"
version = "0.2.2"
publish = false
authors = { workspace = true }
edition = { workspace = true }
@@ -48,6 +48,7 @@ serde = { workspace = true }
serde_json = { workspace = true }
shellexpand = { workspace = true }
strum = { workspace = true, features = [] }
tempfile = { workspace = true }
thiserror = { workspace = true }
toml = { workspace = true }
tracing = { workspace = true, features = ["log"] }

View File

@@ -1,7 +1,7 @@
use std::fmt::Debug;
use std::fs::{self, File};
use std::hash::Hasher;
use std::io::{self, BufReader, BufWriter, Write};
use std::io::{self, BufReader, Write};
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;
@@ -15,6 +15,7 @@ use rayon::iter::ParallelIterator;
use rayon::iter::{IntoParallelIterator, ParallelBridge};
use rustc_hash::FxHashMap;
use serde::{Deserialize, Serialize};
use tempfile::NamedTempFile;
use ruff_cache::{CacheKey, CacheKeyHasher};
use ruff_diagnostics::{DiagnosticKind, Fix};
@@ -165,15 +166,29 @@ impl Cache {
return Ok(());
}
let file = File::create(&self.path)
.with_context(|| format!("Failed to create cache file '{}'", self.path.display()))?;
let writer = BufWriter::new(file);
bincode::serialize_into(writer, &self.package).with_context(|| {
// Write the cache to a temporary file first and then rename it for an "atomic" write.
// Protects against data loss if the process is killed during the write and races between different ruff
// processes, resulting in a corrupted cache file. https://github.com/astral-sh/ruff/issues/8147#issuecomment-1943345964
let mut temp_file =
NamedTempFile::new_in(self.path.parent().expect("Write path must have a parent"))
.context("Failed to create temporary file")?;
// Serialize to in-memory buffer because hyperfine benchmark showed that it's faster than
// using a `BufWriter` and our cache files are small enough that streaming isn't necessary.
let serialized =
bincode::serialize(&self.package).context("Failed to serialize cache data")?;
temp_file
.write_all(&serialized)
.context("Failed to write serialized cache to temporary file.")?;
temp_file.persist(&self.path).with_context(|| {
format!(
"Failed to serialise cache to file '{}'",
"Failed to rename temporary cache file to {}",
self.path.display()
)
})
})?;
Ok(())
}
/// Applies the pending changes without storing the cache to disk.

View File

@@ -25,6 +25,15 @@ import cycles. They also increase the cognitive load of reading the code.
If an import statement is used to check for the availability or existence
of a module, consider using `importlib.util.find_spec` instead.
If an import statement is used to re-export a symbol as part of a module's
public interface, consider using a "redundant" import alias, which
instructs Ruff (and other tools) to respect the re-export, and avoid
marking it as unused, as in:
```python
from module import member as member
```
## Example
```python
import numpy as np # unused import
@@ -51,11 +60,12 @@ else:
```
## Options
- `lint.pyflakes.extend-generics`
- `lint.ignore-init-module-imports`
## References
- [Python documentation: `import`](https://docs.python.org/3/reference/simple_stmts.html#the-import-statement)
- [Python documentation: `importlib.util.find_spec`](https://docs.python.org/3/library/importlib.html#importlib.util.find_spec)
- [Typing documentation: interface conventions](https://typing.readthedocs.io/en/latest/source/libraries.html#library-interface-public-and-private-symbols)
----- stderr -----

View File

@@ -13,6 +13,7 @@ license = { workspace = true }
[lib]
bench = false
doctest = false
[[bench]]
name = "linter"

View File

@@ -11,6 +11,7 @@ repository = { workspace = true }
license = { workspace = true }
[lib]
doctest = false
[dependencies]
ruff_text_size = { path = "../ruff_text_size" }

View File

@@ -11,6 +11,7 @@ repository = { workspace = true }
license = { workspace = true }
[lib]
doctest = false
[dependencies]
ruff_macros = { path = "../ruff_macros" }

View File

@@ -1,6 +1,6 @@
[package]
name = "ruff_linter"
version = "0.2.1"
version = "0.2.2"
publish = false
authors = { workspace = true }
edition = { workspace = true }

View File

@@ -11,13 +11,25 @@ class _UnusedTypedDict2(typing.TypedDict):
class _UsedTypedDict(TypedDict):
foo: bytes
foo: bytes
class _CustomClass(_UsedTypedDict):
bar: list[int]
_UnusedTypedDict3 = TypedDict("_UnusedTypedDict3", {"foo": int})
_UsedTypedDict3 = TypedDict("_UsedTypedDict3", {"bar": bytes})
def uses_UsedTypedDict3(arg: _UsedTypedDict3) -> None: ...
# In `.py` files, we don't flag unused definitions in class scopes (unlike in `.pyi`
# files).
class _CustomClass3:
class _UnusedTypeDict4(TypedDict):
pass
def method(self) -> None:
_CustomClass3._UnusedTypeDict4()

View File

@@ -35,3 +35,13 @@ _UnusedTypedDict3 = TypedDict("_UnusedTypedDict3", {"foo": int})
_UsedTypedDict3 = TypedDict("_UsedTypedDict3", {"bar": bytes})
def uses_UsedTypedDict3(arg: _UsedTypedDict3) -> None: ...
# In `.pyi` files, we flag unused definitions in class scopes as well as in the global
# scope (unlike in `.py` files).
class _CustomClass3:
class _UnusedTypeDict4(TypedDict):
pass
def method(self) -> None:
_CustomClass3._UnusedTypeDict4()

View File

@@ -193,3 +193,11 @@ def func():
for y in range(5):
g(x, idx)
idx += 1
async def func():
# OK (for loop is async)
idx = 0
async for x in async_gen():
g(x, idx)
idx += 1

View File

@@ -36,35 +36,47 @@ for i in list( # Comment
): # PERF101
pass
for i in list(foo_dict): # Ok
for i in list(foo_dict): # OK
pass
for i in list(1): # Ok
for i in list(1): # OK
pass
for i in list(foo_int): # Ok
for i in list(foo_int): # OK
pass
import itertools
for i in itertools.product(foo_int): # Ok
for i in itertools.product(foo_int): # OK
pass
for i in list(foo_list): # Ok
for i in list(foo_list): # OK
foo_list.append(i + 1)
for i in list(foo_list): # PERF101
# Make sure we match the correct list
other_list.append(i + 1)
for i in list(foo_tuple): # Ok
for i in list(foo_tuple): # OK
foo_tuple.append(i + 1)
for i in list(foo_set): # Ok
for i in list(foo_set): # OK
foo_set.append(i + 1)
x, y, nested_tuple = (1, 2, (3, 4, 5))
for i in list(nested_tuple): # PERF101
pass
for i in list(foo_list): # OK
if True:
foo_list.append(i + 1)
for i in list(foo_list): # OK
if True:
foo_list[i] = i + 1
for i in list(foo_list): # OK
if True:
del foo_list[i + 1]

View File

@@ -4,7 +4,12 @@
1 in (
1, 2, 3
)
# OK
fruits = ["cherry", "grapes"]
"cherry" in fruits
_ = {key: value for key, value in {"a": 1, "b": 2}.items() if key in ("a", "b")}
# OK
fruits in [[1, 2, 3], [4, 5, 6]]
fruits in [1, 2, 3]
1 in [[1, 2, 3], [4, 5, 6]]
_ = {key: value for key, value in {"a": 1, "b": 2}.items() if key in (["a", "b"], ["c", "d"])}

View File

@@ -35,6 +35,15 @@ if argc != 0: # correct
if argc != 1: # correct
pass
if argc != -1.0: # correct
pass
if argc != 0.0: # correct
pass
if argc != 1.0: # correct
pass
if argc != 2: # [magic-value-comparison]
pass
@@ -44,6 +53,12 @@ if argc != -2: # [magic-value-comparison]
if argc != +2: # [magic-value-comparison]
pass
if argc != -2.0: # [magic-value-comparison]
pass
if argc != +2.0: # [magic-value-comparison]
pass
if __name__ == "__main__": # correct
pass

View File

@@ -0,0 +1,75 @@
import codecs
import io
from pathlib import Path
# Errors
with open("FURB129.py") as f:
for _line in f.readlines():
pass
a = [line.lower() for line in f.readlines()]
b = {line.upper() for line in f.readlines()}
c = {line.lower(): line.upper() for line in f.readlines()}
with Path("FURB129.py").open() as f:
for _line in f.readlines():
pass
for _line in open("FURB129.py").readlines():
pass
for _line in Path("FURB129.py").open().readlines():
pass
def func():
f = Path("FURB129.py").open()
for _line in f.readlines():
pass
f.close()
def func(f: io.BytesIO):
for _line in f.readlines():
pass
def func():
with (open("FURB129.py") as f, foo as bar):
for _line in f.readlines():
pass
for _line in bar.readlines():
pass
# False positives
def func(f):
for _line in f.readlines():
pass
def func(f: codecs.StreamReader):
for _line in f.readlines():
pass
def func():
class A:
def readlines(self) -> list[str]:
return ["a", "b", "c"]
return A()
for _line in func().readlines():
pass
# OK
for _line in ["a", "b", "c"]:
pass
with open("FURB129.py") as f:
for _line in f:
pass
for _line in f.readlines(10):
pass
for _not_line in f.readline():
pass

View File

@@ -162,3 +162,26 @@ async def f(x: bool):
T = asyncio.create_task(asyncio.sleep(1))
else:
T = None
# Error
def f():
loop = asyncio.new_event_loop()
loop.create_task(main()) # Error
# Error
def f():
loop = asyncio.get_event_loop()
loop.create_task(main()) # Error
# OK
def f():
global task
loop = asyncio.new_event_loop()
task = loop.create_task(main()) # Error
# OK
def f():
global task
loop = asyncio.get_event_loop()
task = loop.create_task(main()) # Error

View File

@@ -68,3 +68,7 @@ def method_calls():
first = "Wendy"
last = "Appleseed"
value.method("{first} {last}") # RUF027
def format_specifiers():
a = 4
b = "{a:b} {a:^5}"

View File

@@ -0,0 +1,2 @@
# 测试eval函数,eval()函数用来执行一个字符串表达式并返t表达式的值。另外可以讲字符串转换成列表或元组或字典
a = "{1: 'a', 2: 'b'}"

View File

@@ -2,11 +2,14 @@ use ruff_python_ast::Comprehension;
use crate::checkers::ast::Checker;
use crate::codes::Rule;
use crate::rules::flake8_simplify;
use crate::rules::{flake8_simplify, refurb};
/// Run lint rules over a [`Comprehension`] syntax nodes.
pub(crate) fn comprehension(comprehension: &Comprehension, checker: &mut Checker) {
if checker.enabled(Rule::InDictKeys) {
flake8_simplify::rules::key_in_dict_comprehension(checker, comprehension);
}
if checker.enabled(Rule::ReadlinesInFor) {
refurb::rules::readlines_in_comprehension(checker, comprehension);
}
}

View File

@@ -281,17 +281,21 @@ pub(crate) fn deferred_scopes(checker: &mut Checker) {
}
}
if checker.enabled(Rule::UnusedPrivateTypeVar) {
flake8_pyi::rules::unused_private_type_var(checker, scope, &mut diagnostics);
}
if checker.enabled(Rule::UnusedPrivateProtocol) {
flake8_pyi::rules::unused_private_protocol(checker, scope, &mut diagnostics);
}
if checker.enabled(Rule::UnusedPrivateTypeAlias) {
flake8_pyi::rules::unused_private_type_alias(checker, scope, &mut diagnostics);
}
if checker.enabled(Rule::UnusedPrivateTypedDict) {
flake8_pyi::rules::unused_private_typed_dict(checker, scope, &mut diagnostics);
if checker.source_type.is_stub()
|| matches!(scope.kind, ScopeKind::Module | ScopeKind::Function(_))
{
if checker.enabled(Rule::UnusedPrivateTypeVar) {
flake8_pyi::rules::unused_private_type_var(checker, scope, &mut diagnostics);
}
if checker.enabled(Rule::UnusedPrivateProtocol) {
flake8_pyi::rules::unused_private_protocol(checker, scope, &mut diagnostics);
}
if checker.enabled(Rule::UnusedPrivateTypeAlias) {
flake8_pyi::rules::unused_private_type_alias(checker, scope, &mut diagnostics);
}
if checker.enabled(Rule::UnusedPrivateTypedDict) {
flake8_pyi::rules::unused_private_typed_dict(checker, scope, &mut diagnostics);
}
}
if checker.enabled(Rule::AsyncioDanglingTask) {

View File

@@ -1317,6 +1317,9 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
if checker.enabled(Rule::UnnecessaryDictIndexLookup) {
pylint::rules::unnecessary_dict_index_lookup(checker, for_stmt);
}
if checker.enabled(Rule::ReadlinesInFor) {
refurb::rules::readlines_in_for(checker, for_stmt);
}
if !is_async {
if checker.enabled(Rule::ReimplementedBuiltin) {
flake8_simplify::rules::convert_for_loop_to_any_all(checker, stmt);

View File

@@ -2,10 +2,16 @@ use ruff_python_ast::StringLike;
use crate::checkers::ast::Checker;
use crate::codes::Rule;
use crate::rules::{flake8_bandit, flake8_pyi};
use crate::rules::{flake8_bandit, flake8_pyi, ruff};
/// Run lint rules over a [`StringLike`] syntax nodes.
pub(crate) fn string_like(string_like: StringLike, checker: &mut Checker) {
if checker.any_enabled(&[
Rule::AmbiguousUnicodeCharacterString,
Rule::AmbiguousUnicodeCharacterDocstring,
]) {
ruff::rules::ambiguous_unicode_character_string(checker, string_like);
}
if checker.enabled(Rule::HardcodedBindAllInterfaces) {
flake8_bandit::rules::hardcoded_bind_all_interfaces(checker, string_like);
}

View File

@@ -40,7 +40,7 @@ use ruff_diagnostics::{Diagnostic, IsolationLevel};
use ruff_notebook::{CellOffsets, NotebookIndex};
use ruff_python_ast::all::{extract_all_names, DunderAllFlags};
use ruff_python_ast::helpers::{
collect_import_from_member, extract_handled_exceptions, to_module_path,
collect_import_from_member, extract_handled_exceptions, is_docstring_stmt, to_module_path,
};
use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::str::trailing_quote;
@@ -71,6 +71,38 @@ mod analyze;
mod annotation;
mod deferred;
/// State representing whether a docstring is expected or not for the next statement.
#[derive(Default, Debug, Copy, Clone, PartialEq)]
enum DocstringState {
/// The next statement is expected to be a docstring, but not necessarily so.
///
/// For example, in the following code:
///
/// ```python
/// class Foo:
/// pass
///
///
/// def bar(x, y):
/// """Docstring."""
/// return x + y
/// ```
///
/// For `Foo`, the state is expected when the checker is visiting the class
/// body but isn't going to be present. While, for `bar` function, the docstring
/// is expected and present.
#[default]
Expected,
Other,
}
impl DocstringState {
/// Returns `true` if the next statement is expected to be a docstring.
const fn is_expected(self) -> bool {
matches!(self, DocstringState::Expected)
}
}
pub(crate) struct Checker<'a> {
/// The [`Path`] to the file under analysis.
path: &'a Path,
@@ -114,6 +146,8 @@ pub(crate) struct Checker<'a> {
pub(crate) flake8_bugbear_seen: Vec<TextRange>,
/// The end offset of the last visited statement.
last_stmt_end: TextSize,
/// A state describing if a docstring is expected or not.
docstring_state: DocstringState,
}
impl<'a> Checker<'a> {
@@ -153,6 +187,7 @@ impl<'a> Checker<'a> {
cell_offsets,
notebook_index,
last_stmt_end: TextSize::default(),
docstring_state: DocstringState::default(),
}
}
}
@@ -305,19 +340,16 @@ where
self.semantic.flags -= SemanticModelFlags::IMPORT_BOUNDARY;
}
// Track whether we've seen docstrings, non-imports, etc.
// Track whether we've seen module docstrings, non-imports, etc.
match stmt {
Stmt::Expr(ast::StmtExpr { value, .. })
if !self
.semantic
.flags
.intersects(SemanticModelFlags::MODULE_DOCSTRING)
if !self.semantic.seen_module_docstring_boundary()
&& value.is_string_literal_expr() =>
{
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING;
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING_BOUNDARY;
}
Stmt::ImportFrom(ast::StmtImportFrom { module, names, .. }) => {
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING;
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING_BOUNDARY;
// Allow __future__ imports until we see a non-__future__ import.
if let Some("__future__") = module.as_deref() {
@@ -332,11 +364,11 @@ where
}
}
Stmt::Import(_) => {
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING;
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING_BOUNDARY;
self.semantic.flags |= SemanticModelFlags::FUTURES_BOUNDARY;
}
_ => {
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING;
self.semantic.flags |= SemanticModelFlags::MODULE_DOCSTRING_BOUNDARY;
self.semantic.flags |= SemanticModelFlags::FUTURES_BOUNDARY;
if !(self.semantic.seen_import_boundary()
|| helpers::is_assignment_to_a_dunder(stmt)
@@ -353,6 +385,16 @@ where
// the node.
let flags_snapshot = self.semantic.flags;
// Update the semantic model if it is in a docstring. This should be done after the
// flags snapshot to ensure that it gets reset once the statement is analyzed.
if self.docstring_state.is_expected() {
if is_docstring_stmt(stmt) {
self.semantic.flags |= SemanticModelFlags::DOCSTRING;
}
// Reset the state irrespective of whether the statement is a docstring or not.
self.docstring_state = DocstringState::Other;
}
// Step 1: Binding
match stmt {
Stmt::AugAssign(ast::StmtAugAssign {
@@ -654,6 +696,8 @@ where
self.semantic.set_globals(globals);
}
// Set the docstring state before visiting the class body.
self.docstring_state = DocstringState::Expected;
self.visit_body(body);
}
Stmt::TypeAlias(ast::StmtTypeAlias {
@@ -1288,6 +1332,16 @@ where
self.semantic.flags |= SemanticModelFlags::F_STRING;
visitor::walk_expr(self, expr);
}
Expr::NamedExpr(ast::ExprNamedExpr {
target,
value,
range: _,
}) => {
self.visit_expr(value);
self.semantic.flags |= SemanticModelFlags::NAMED_EXPRESSION_ASSIGNMENT;
self.visit_expr(target);
}
_ => visitor::walk_expr(self, expr),
}
@@ -1504,6 +1558,8 @@ impl<'a> Checker<'a> {
unreachable!("Generator expression must contain at least one generator");
};
let flags = self.semantic.flags;
// Generators are compiled as nested functions. (This may change with PEP 709.)
// As such, the `iter` of the first generator is evaluated in the outer scope, while all
// subsequent nodes are evaluated in the inner scope.
@@ -1533,14 +1589,22 @@ impl<'a> Checker<'a> {
// `x` is local to `foo`, and the `T` in `y=T` skips the class scope when resolving.
self.visit_expr(&generator.iter);
self.semantic.push_scope(ScopeKind::Generator);
self.semantic.flags = flags | SemanticModelFlags::COMPREHENSION_ASSIGNMENT;
self.visit_expr(&generator.target);
self.semantic.flags = flags;
for expr in &generator.ifs {
self.visit_boolean_test(expr);
}
for generator in iterator {
self.visit_expr(&generator.iter);
self.semantic.flags = flags | SemanticModelFlags::COMPREHENSION_ASSIGNMENT;
self.visit_expr(&generator.target);
self.semantic.flags = flags;
for expr in &generator.ifs {
self.visit_boolean_test(expr);
}
@@ -1739,11 +1803,21 @@ impl<'a> Checker<'a> {
return;
}
// A binding within a `for` must be a loop variable, as in:
// ```python
// for x in range(10):
// ...
// ```
if parent.is_for_stmt() {
self.add_binding(id, expr.range(), BindingKind::LoopVar, flags);
return;
}
// A binding within a `with` must be an item, as in:
// ```python
// with open("file.txt") as fp:
// ...
// ```
if parent.is_with_stmt() {
self.add_binding(id, expr.range(), BindingKind::WithItemVar, flags);
return;
@@ -1799,17 +1873,26 @@ impl<'a> Checker<'a> {
}
// If the expression is the left-hand side of a walrus operator, then it's a named
// expression assignment.
if self
.semantic
.current_expressions()
.filter_map(Expr::as_named_expr_expr)
.any(|parent| parent.target.as_ref() == expr)
{
// expression assignment, as in:
// ```python
// if (x := 10) > 5:
// ...
// ```
if self.semantic.in_named_expression_assignment() {
self.add_binding(id, expr.range(), BindingKind::NamedExprAssignment, flags);
return;
}
// If the expression is part of a comprehension target, then it's a comprehension variable
// assignment, as in:
// ```python
// [x for x in range(10)]
// ```
if self.semantic.in_comprehension_assignment() {
self.add_binding(id, expr.range(), BindingKind::ComprehensionVar, flags);
return;
}
self.add_binding(id, expr.range(), BindingKind::Assignment, flags);
}
@@ -1925,6 +2008,8 @@ impl<'a> Checker<'a> {
};
self.visit_parameters(parameters);
// Set the docstring state before visiting the function body.
self.docstring_state = DocstringState::Expected;
self.visit_body(body);
}
}

View File

@@ -6,17 +6,14 @@ use ruff_notebook::CellOffsets;
use ruff_python_ast::PySourceType;
use ruff_python_codegen::Stylist;
use ruff_python_parser::lexer::LexResult;
use ruff_python_parser::Tok;
use ruff_diagnostics::Diagnostic;
use ruff_python_index::Indexer;
use ruff_source_file::Locator;
use crate::directives::TodoComment;
use crate::lex::docstring_detection::StateMachine;
use crate::registry::{AsRule, Rule};
use crate::rules::pycodestyle::rules::BlankLinesChecker;
use crate::rules::ruff::rules::Context;
use crate::rules::{
eradicate, flake8_commas, flake8_executable, flake8_fixme, flake8_implicit_str_concat,
flake8_pyi, flake8_quotes, flake8_todos, pycodestyle, pygrep_hooks, pylint, pyupgrade, ruff,
@@ -66,31 +63,15 @@ pub(crate) fn check_tokens(
pylint::rules::empty_comments(&mut diagnostics, indexer, locator);
}
if settings.rules.any_enabled(&[
Rule::AmbiguousUnicodeCharacterString,
Rule::AmbiguousUnicodeCharacterDocstring,
Rule::AmbiguousUnicodeCharacterComment,
]) {
let mut state_machine = StateMachine::default();
for &(ref tok, range) in tokens.iter().flatten() {
let is_docstring = state_machine.consume(tok);
let context = match tok {
Tok::String { .. } => {
if is_docstring {
Context::Docstring
} else {
Context::String
}
}
Tok::FStringMiddle { .. } => Context::String,
Tok::Comment(_) => Context::Comment,
_ => continue,
};
ruff::rules::ambiguous_unicode_character(
if settings
.rules
.enabled(Rule::AmbiguousUnicodeCharacterComment)
{
for range in indexer.comment_ranges() {
ruff::rules::ambiguous_unicode_character_comment(
&mut diagnostics,
locator,
range,
context,
*range,
settings,
);
}

View File

@@ -1025,6 +1025,7 @@ pub fn code_to_rule(linter: Linter, code: &str) -> Option<(RuleGroup, Rule)> {
#[allow(deprecated)]
(Refurb, "113") => (RuleGroup::Nursery, rules::refurb::rules::RepeatedAppend),
(Refurb, "118") => (RuleGroup::Preview, rules::refurb::rules::ReimplementedOperator),
(Refurb, "129") => (RuleGroup::Preview, rules::refurb::rules::ReadlinesInFor),
#[allow(deprecated)]
(Refurb, "131") => (RuleGroup::Nursery, rules::refurb::rules::DeleteFullSlice),
#[allow(deprecated)]

View File

@@ -256,8 +256,6 @@ impl Rule {
| Rule::MixedSpacesAndTabs
| Rule::TrailingWhitespace => LintSource::PhysicalLines,
Rule::AmbiguousUnicodeCharacterComment
| Rule::AmbiguousUnicodeCharacterDocstring
| Rule::AmbiguousUnicodeCharacterString
| Rule::AvoidableEscapedQuote
| Rule::BadQuotesDocstring
| Rule::BadQuotesInlineString

View File

@@ -248,6 +248,7 @@ impl Renamer {
| BindingKind::Assignment
| BindingKind::BoundException
| BindingKind::LoopVar
| BindingKind::ComprehensionVar
| BindingKind::WithItemVar
| BindingKind::Global
| BindingKind::Nonlocal(_)

View File

@@ -118,8 +118,7 @@ fn is_open_call_from_pathlib(func: &Expr, semantic: &SemanticModel) -> bool {
let binding = semantic.binding(binding_id);
let Some(Expr::Call(call)) = analyze::typing::find_binding_value(&name.id, binding, semantic)
else {
let Some(Expr::Call(call)) = analyze::typing::find_binding_value(binding, semantic) else {
return false;
};

View File

@@ -15,13 +15,11 @@ PYI049.py:9:7: PYI049 Private TypedDict `_UnusedTypedDict2` is never used
10 | bar: int
|
PYI049.py:20:1: PYI049 Private TypedDict `_UnusedTypedDict3` is never used
PYI049.py:21:1: PYI049 Private TypedDict `_UnusedTypedDict3` is never used
|
18 | bar: list[int]
19 |
20 | _UnusedTypedDict3 = TypedDict("_UnusedTypedDict3", {"foo": int})
21 | _UnusedTypedDict3 = TypedDict("_UnusedTypedDict3", {"foo": int})
| ^^^^^^^^^^^^^^^^^ PYI049
21 | _UsedTypedDict3 = TypedDict("_UsedTypedDict3", {"bar": bytes})
22 | _UsedTypedDict3 = TypedDict("_UsedTypedDict3", {"bar": bytes})
|

View File

@@ -24,4 +24,13 @@ PYI049.pyi:34:1: PYI049 Private TypedDict `_UnusedTypedDict3` is never used
35 | _UsedTypedDict3 = TypedDict("_UsedTypedDict3", {"bar": bytes})
|
PYI049.pyi:43:11: PYI049 Private TypedDict `_UnusedTypeDict4` is never used
|
41 | # scope (unlike in `.py` files).
42 | class _CustomClass3:
43 | class _UnusedTypeDict4(TypedDict):
| ^^^^^^^^^^^^^^^^ PYI049
44 | pass
|

View File

@@ -49,6 +49,11 @@ impl Violation for EnumerateForLoop {
/// SIM113
pub(crate) fn enumerate_for_loop(checker: &mut Checker, for_stmt: &ast::StmtFor) {
// If the loop is async, abort.
if for_stmt.is_async {
return;
}
// If the loop contains a `continue`, abort.
let mut visitor = LoopControlFlowVisitor::default();
visitor.visit_body(&for_stmt.body);
@@ -76,8 +81,7 @@ pub(crate) fn enumerate_for_loop(checker: &mut Checker, for_stmt: &ast::StmtFor)
}
// Ensure that the index variable was initialized to 0.
let Some(value) = typing::find_binding_value(&index.id, binding, checker.semantic())
else {
let Some(value) = typing::find_binding_value(binding, checker.semantic()) else {
continue;
};
if !matches!(

View File

@@ -419,23 +419,20 @@ mod tests {
Ok(())
}
// Test currently disabled as line endings are automatically converted to
// platform-appropriate ones in CI/CD #[test_case(Path::new("
// line_ending_crlf.py"))] #[test_case(Path::new("line_ending_lf.py"))]
// fn source_code_style(path: &Path) -> Result<()> {
// let snapshot = format!("{}", path.to_string_lossy());
// let diagnostics = test_path(
// Path::new("isort")
// .join(path)
// .as_path(),
// &LinterSettings {
// src: vec![test_resource_path("fixtures/isort")],
// ..LinterSettings::for_rule(Rule::UnsortedImports)
// },
// )?;
// crate::assert_messages!(snapshot, diagnostics);
// Ok(())
// }
#[test_case(Path::new("line_ending_crlf.py"))]
#[test_case(Path::new("line_ending_lf.py"))]
fn source_code_style(path: &Path) -> Result<()> {
let snapshot = format!("{}", path.to_string_lossy());
let diagnostics = test_path(
Path::new("isort").join(path).as_path(),
&LinterSettings {
src: vec![test_resource_path("fixtures/isort")],
..LinterSettings::for_rule(Rule::UnsortedImports)
},
)?;
crate::assert_messages!(snapshot, diagnostics);
Ok(())
}
#[test_case(Path::new("separate_local_folder_imports.py"))]
fn known_local_folder(path: &Path) -> Result<()> {

View File

@@ -0,0 +1,23 @@
---
source: crates/ruff_linter/src/rules/isort/mod.rs
---
line_ending_crlf.py:1:1: I001 [*] Import block is un-sorted or un-formatted
|
1 | / from long_module_name import member_one, member_two, member_three, member_four, member_five
2 | |
| |_^ I001
|
= help: Organize imports
Safe fix
1 |-from long_module_name import member_one, member_two, member_three, member_four, member_five
1 |+from long_module_name import (
2 |+ member_five,
3 |+ member_four,
4 |+ member_one,
5 |+ member_three,
6 |+ member_two,
7 |+)
2 8 |

View File

@@ -0,0 +1,23 @@
---
source: crates/ruff_linter/src/rules/isort/mod.rs
---
line_ending_lf.py:1:1: I001 [*] Import block is un-sorted or un-formatted
|
1 | / from long_module_name import member_one, member_two, member_three, member_four, member_five
2 | |
| |_^ I001
|
= help: Organize imports
Safe fix
1 |-from long_module_name import member_one, member_two, member_three, member_four, member_five
1 |+from long_module_name import (
2 |+ member_five,
3 |+ member_four,
4 |+ member_one,
5 |+ member_three,
6 |+ member_two,
7 |+)
2 8 |

View File

@@ -47,6 +47,7 @@ pub(super) fn test_expression(expr: &Expr, semantic: &SemanticModel) -> Resoluti
| BindingKind::Assignment
| BindingKind::NamedExprAssignment
| BindingKind::LoopVar
| BindingKind::ComprehensionVar
| BindingKind::Global
| BindingKind::Nonlocal(_) => Resolution::RelevantLocal,
BindingKind::Import(import) if matches!(import.call_path(), ["pandas"]) => {

View File

@@ -1,5 +1,6 @@
use ruff_diagnostics::{AlwaysFixableViolation, Diagnostic, Edit, Fix};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::statement_visitor::{walk_stmt, StatementVisitor};
use ruff_python_ast::{self as ast, Arguments, Expr, Stmt};
use ruff_python_semantic::analyze::typing::find_assigned_value;
use ruff_text_size::TextRange;
@@ -98,22 +99,25 @@ pub(crate) fn unnecessary_list_cast(checker: &mut Checker, iter: &Expr, body: &[
range: iterable_range,
..
}) => {
// If the variable is being appended to, don't suggest removing the cast:
//
// ```python
// items = ["foo", "bar"]
// for item in list(items):
// items.append("baz")
// ```
//
// Here, removing the `list()` cast would change the behavior of the code.
if body.iter().any(|stmt| match_append(stmt, id)) {
return;
}
let Some(value) = find_assigned_value(id, checker.semantic()) else {
return;
};
if matches!(value, Expr::Tuple(_) | Expr::List(_) | Expr::Set(_)) {
// If the variable is being modified to, don't suggest removing the cast:
//
// ```python
// items = ["foo", "bar"]
// for item in list(items):
// items.append("baz")
// ```
//
// Here, removing the `list()` cast would change the behavior of the code.
let mut visitor = MutationVisitor::new(id);
visitor.visit_body(body);
if visitor.is_mutated {
return;
}
let mut diagnostic = Diagnostic::new(UnnecessaryListCast, *list_range);
diagnostic.set_fix(remove_cast(*list_range, *iterable_range));
checker.diagnostics.push(diagnostic);
@@ -123,28 +127,6 @@ pub(crate) fn unnecessary_list_cast(checker: &mut Checker, iter: &Expr, body: &[
}
}
/// Check if a statement is an `append` call to a given identifier.
///
/// For example, `foo.append(bar)` would return `true` if `id` is `foo`.
fn match_append(stmt: &Stmt, id: &str) -> bool {
let Some(ast::StmtExpr { value, .. }) = stmt.as_expr_stmt() else {
return false;
};
let Some(ast::ExprCall { func, .. }) = value.as_call_expr() else {
return false;
};
let Some(ast::ExprAttribute { value, attr, .. }) = func.as_attribute_expr() else {
return false;
};
if attr != "append" {
return false;
}
let Some(ast::ExprName { id: target_id, .. }) = value.as_name_expr() else {
return false;
};
target_id == id
}
/// Generate a [`Fix`] to remove a `list` cast from an expression.
fn remove_cast(list_range: TextRange, iterable_range: TextRange) -> Fix {
Fix::safe_edits(
@@ -152,3 +134,95 @@ fn remove_cast(list_range: TextRange, iterable_range: TextRange) -> Fix {
[Edit::deletion(iterable_range.end(), list_range.end())],
)
}
/// A [`StatementVisitor`] that (conservatively) identifies mutations to a variable.
#[derive(Default)]
pub(crate) struct MutationVisitor<'a> {
pub(crate) target: &'a str,
pub(crate) is_mutated: bool,
}
impl<'a> MutationVisitor<'a> {
pub(crate) fn new(target: &'a str) -> Self {
Self {
target,
is_mutated: false,
}
}
}
impl<'a, 'b> StatementVisitor<'b> for MutationVisitor<'a>
where
'b: 'a,
{
fn visit_stmt(&mut self, stmt: &'b Stmt) {
if match_mutation(stmt, self.target) {
self.is_mutated = true;
} else {
walk_stmt(self, stmt);
}
}
}
/// Check if a statement is (probably) a modification to the list assigned to the given identifier.
///
/// For example, `foo.append(bar)` would return `true` if `id` is `foo`.
fn match_mutation(stmt: &Stmt, id: &str) -> bool {
match stmt {
// Ex) `foo.append(bar)`
Stmt::Expr(ast::StmtExpr { value, .. }) => {
let Some(ast::ExprCall { func, .. }) = value.as_call_expr() else {
return false;
};
let Some(ast::ExprAttribute { value, attr, .. }) = func.as_attribute_expr() else {
return false;
};
if !matches!(
attr.as_str(),
"append" | "insert" | "extend" | "remove" | "pop" | "clear" | "reverse" | "sort"
) {
return false;
}
let Some(ast::ExprName { id: target_id, .. }) = value.as_name_expr() else {
return false;
};
target_id == id
}
// Ex) `foo[0] = bar`
Stmt::Assign(ast::StmtAssign { targets, .. }) => targets.iter().any(|target| {
if let Some(ast::ExprSubscript { value: target, .. }) = target.as_subscript_expr() {
if let Some(ast::ExprName { id: target_id, .. }) = target.as_name_expr() {
return target_id == id;
}
}
false
}),
// Ex) `foo += bar`
Stmt::AugAssign(ast::StmtAugAssign { target, .. }) => {
if let Some(ast::ExprName { id: target_id, .. }) = target.as_name_expr() {
target_id == id
} else {
false
}
}
// Ex) `foo[0]: int = bar`
Stmt::AnnAssign(ast::StmtAnnAssign { target, .. }) => {
if let Some(ast::ExprSubscript { value: target, .. }) = target.as_subscript_expr() {
if let Some(ast::ExprName { id: target_id, .. }) = target.as_name_expr() {
return target_id == id;
}
}
false
}
// Ex) `del foo[0]`
Stmt::Delete(ast::StmtDelete { targets, .. }) => targets.iter().any(|target| {
if let Some(ast::ExprSubscript { value: target, .. }) = target.as_subscript_expr() {
if let Some(ast::ExprName { id: target_id, .. }) = target.as_name_expr() {
return target_id == id;
}
}
false
}),
_ => false,
}
}

View File

@@ -178,7 +178,7 @@ PERF101.py:34:10: PERF101 [*] Do not cast an iterable to `list` before iterating
34 |+for i in {1, 2, 3}: # PERF101
37 35 | pass
38 36 |
39 37 | for i in list(foo_dict): # Ok
39 37 | for i in list(foo_dict): # OK
PERF101.py:57:10: PERF101 [*] Do not cast an iterable to `list` before iterating over it
|
@@ -192,7 +192,7 @@ PERF101.py:57:10: PERF101 [*] Do not cast an iterable to `list` before iterating
= help: Remove `list()` cast
Safe fix
54 54 | for i in list(foo_list): # Ok
54 54 | for i in list(foo_list): # OK
55 55 | foo_list.append(i + 1)
56 56 |
57 |-for i in list(foo_list): # PERF101
@@ -218,5 +218,7 @@ PERF101.py:69:10: PERF101 [*] Do not cast an iterable to `list` before iterating
69 |-for i in list(nested_tuple): # PERF101
69 |+for i in nested_tuple: # PERF101
70 70 | pass
71 71 |
72 72 | for i in list(foo_list): # OK

View File

@@ -87,12 +87,17 @@ impl Violation for IndentWithSpaces {
/// """
/// ```
///
/// ## Formatter compatibility
/// We recommend against using this rule alongside the [formatter]. The
/// formatter enforces consistent indentation, making the rule redundant.
///
/// ## References
/// - [PEP 257 Docstring Conventions](https://peps.python.org/pep-0257/)
/// - [NumPy Style Guide](https://numpydoc.readthedocs.io/en/latest/format.html)
/// - [Google Python Style Guide - Docstrings](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings)
///
/// [PEP 257]: https://peps.python.org/pep-0257/
/// [formatter]: https://docs.astral.sh/ruff/formatter/
#[violation]
pub struct UnderIndentation;

View File

@@ -27,10 +27,16 @@ use crate::docstrings::Docstring;
/// """Return the pathname of the KOS root directory."""
/// ```
///
/// ## Formatter compatibility
/// We recommend against using this rule alongside the [formatter]. The
/// formatter enforces consistent quotes, making the rule redundant.
///
/// ## References
/// - [PEP 257 Docstring Conventions](https://peps.python.org/pep-0257/)
/// - [NumPy Style Guide](https://numpydoc.readthedocs.io/en/latest/format.html)
/// - [Google Python Style Guide - Docstrings](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings)
///
/// [formatter]: https://docs.astral.sh/ruff/formatter/
#[violation]
pub struct TripleSingleQuotes {
expected_quote: Quote,

View File

@@ -28,6 +28,15 @@ enum UnusedImportContext {
/// If an import statement is used to check for the availability or existence
/// of a module, consider using `importlib.util.find_spec` instead.
///
/// If an import statement is used to re-export a symbol as part of a module's
/// public interface, consider using a "redundant" import alias, which
/// instructs Ruff (and other tools) to respect the re-export, and avoid
/// marking it as unused, as in:
///
/// ```python
/// from module import member as member
/// ```
///
/// ## Example
/// ```python
/// import numpy as np # unused import
@@ -54,11 +63,12 @@ enum UnusedImportContext {
/// ```
///
/// ## Options
/// - `lint.pyflakes.extend-generics`
/// - `lint.ignore-init-module-imports`
///
/// ## References
/// - [Python documentation: `import`](https://docs.python.org/3/reference/simple_stmts.html#the-import-statement)
/// - [Python documentation: `importlib.util.find_spec`](https://docs.python.org/3/library/importlib.html#importlib.util.find_spec)
/// - [Typing documentation: interface conventions](https://typing.readthedocs.io/en/latest/source/libraries.html#library-interface-public-and-private-symbols)
#[violation]
pub struct UnusedImport {
name: String,

View File

@@ -1,6 +1,7 @@
use ruff_diagnostics::{AlwaysFixableViolation, Diagnostic, Edit, Fix};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{self as ast, CmpOp, Expr};
use ruff_python_semantic::analyze::typing;
use ruff_text_size::Ranged;
use crate::checkers::ast::Checker;
@@ -25,7 +26,8 @@ use crate::checkers::ast::Checker;
/// ## Fix safety
/// This rule's fix is marked as unsafe, as the use of a `set` literal will
/// error at runtime if the sequence contains unhashable elements (like lists
/// or dictionaries).
/// or dictionaries). While Ruff will attempt to infer the hashability of the
/// elements, it may not always be able to do so.
///
/// ## References
/// - [Whats New In Python 3.2](https://docs.python.org/3/whatsnew/3.2.html#optimizations)
@@ -57,7 +59,40 @@ pub(crate) fn literal_membership(checker: &mut Checker, compare: &ast::ExprCompa
return;
};
if !matches!(right, Expr::List(_) | Expr::Tuple(_)) {
let elts = match right {
Expr::List(ast::ExprList { elts, .. }) => elts,
Expr::Tuple(ast::ExprTuple { elts, .. }) => elts,
_ => return,
};
// If `left`, or any of the elements in `right`, are known to _not_ be hashable, return.
if std::iter::once(compare.left.as_ref())
.chain(elts)
.any(|expr| match expr {
// Expressions that are known _not_ to be hashable.
Expr::List(_)
| Expr::Set(_)
| Expr::Dict(_)
| Expr::ListComp(_)
| Expr::SetComp(_)
| Expr::DictComp(_)
| Expr::GeneratorExp(_)
| Expr::Await(_)
| Expr::Yield(_)
| Expr::YieldFrom(_) => true,
// Expressions that can be _inferred_ not to be hashable.
Expr::Name(name) => {
let Some(id) = checker.semantic().resolve_name(name) else {
return false;
};
let binding = checker.semantic().binding(id);
typing::is_list(binding, checker.semantic())
|| typing::is_dict(binding, checker.semantic())
|| typing::is_set(binding, checker.semantic())
}
_ => false,
})
{
return;
}

View File

@@ -86,8 +86,10 @@ fn is_magic_value(literal_expr: LiteralExpressionRef, allowed_types: &[ConstantT
!matches!(value.to_str(), "" | "__main__")
}
LiteralExpressionRef::NumberLiteral(ast::ExprNumberLiteral { value, .. }) => match value {
#[allow(clippy::float_cmp)]
ast::Number::Float(value) => !(*value == 0.0 || *value == 1.0),
ast::Number::Int(value) => !matches!(*value, Int::ZERO | Int::ONE),
_ => true,
ast::Number::Complex { .. } => true,
},
LiteralExpressionRef::BytesLiteral(_) => true,
}

View File

@@ -52,6 +52,7 @@ pub(crate) fn non_ascii_name(binding: &Binding, locator: &Locator) -> Option<Dia
BindingKind::Assignment => Kind::Assignment,
BindingKind::TypeParam => Kind::TypeParam,
BindingKind::LoopVar => Kind::LoopVar,
BindingKind::ComprehensionVar => Kind::ComprenhensionVar,
BindingKind::WithItemVar => Kind::WithItemVar,
BindingKind::Global => Kind::Global,
BindingKind::Nonlocal(_) => Kind::Nonlocal,
@@ -88,6 +89,7 @@ enum Kind {
Assignment,
TypeParam,
LoopVar,
ComprenhensionVar,
WithItemVar,
Global,
Nonlocal,
@@ -105,6 +107,7 @@ impl fmt::Display for Kind {
Kind::Assignment => f.write_str("Variable"),
Kind::TypeParam => f.write_str("Type parameter"),
Kind::LoopVar => f.write_str("Variable"),
Kind::ComprenhensionVar => f.write_str("Variable"),
Kind::WithItemVar => f.write_str("Variable"),
Kind::Global => f.write_str("Global"),
Kind::Nonlocal => f.write_str("Nonlocal"),

View File

@@ -10,49 +10,67 @@ magic_value_comparison.py:5:4: PLR2004 Magic value used in comparison, consider
6 | pass
|
magic_value_comparison.py:38:12: PLR2004 Magic value used in comparison, consider replacing `2` with a constant variable
magic_value_comparison.py:47:12: PLR2004 Magic value used in comparison, consider replacing `2` with a constant variable
|
36 | pass
37 |
38 | if argc != 2: # [magic-value-comparison]
| ^ PLR2004
39 | pass
|
magic_value_comparison.py:41:12: PLR2004 Magic value used in comparison, consider replacing `-2` with a constant variable
|
39 | pass
40 |
41 | if argc != -2: # [magic-value-comparison]
| ^^ PLR2004
42 | pass
|
magic_value_comparison.py:44:12: PLR2004 Magic value used in comparison, consider replacing `+2` with a constant variable
|
42 | pass
43 |
44 | if argc != +2: # [magic-value-comparison]
| ^^ PLR2004
45 | pass
46 |
47 | if argc != 2: # [magic-value-comparison]
| ^ PLR2004
48 | pass
|
magic_value_comparison.py:65:21: PLR2004 Magic value used in comparison, consider replacing `3.141592653589793238` with a constant variable
magic_value_comparison.py:50:12: PLR2004 Magic value used in comparison, consider replacing `-2` with a constant variable
|
63 | pi_estimation = 3.14
64 |
65 | if pi_estimation == 3.141592653589793238: # [magic-value-comparison]
48 | pass
49 |
50 | if argc != -2: # [magic-value-comparison]
| ^^ PLR2004
51 | pass
|
magic_value_comparison.py:53:12: PLR2004 Magic value used in comparison, consider replacing `+2` with a constant variable
|
51 | pass
52 |
53 | if argc != +2: # [magic-value-comparison]
| ^^ PLR2004
54 | pass
|
magic_value_comparison.py:56:12: PLR2004 Magic value used in comparison, consider replacing `-2.0` with a constant variable
|
54 | pass
55 |
56 | if argc != -2.0: # [magic-value-comparison]
| ^^^^ PLR2004
57 | pass
|
magic_value_comparison.py:59:12: PLR2004 Magic value used in comparison, consider replacing `+2.0` with a constant variable
|
57 | pass
58 |
59 | if argc != +2.0: # [magic-value-comparison]
| ^^^^ PLR2004
60 | pass
|
magic_value_comparison.py:80:21: PLR2004 Magic value used in comparison, consider replacing `3.141592653589793238` with a constant variable
|
78 | pi_estimation = 3.14
79 |
80 | if pi_estimation == 3.141592653589793238: # [magic-value-comparison]
| ^^^^^^^^^^^^^^^^^^^^ PLR2004
66 | pass
81 | pass
|
magic_value_comparison.py:71:21: PLR2004 Magic value used in comparison, consider replacing `0x3` with a constant variable
magic_value_comparison.py:86:21: PLR2004 Magic value used in comparison, consider replacing `0x3` with a constant variable
|
69 | pass
70 |
71 | if pi_estimation == 0x3: # [magic-value-comparison]
84 | pass
85 |
86 | if pi_estimation == 0x3: # [magic-value-comparison]
| ^^^ PLR2004
72 | pass
87 | pass
|

View File

@@ -48,8 +48,8 @@ literal_membership.py:4:6: PLR6201 [*] Use a `set` literal when testing for memb
5 | | 1, 2, 3
6 | | )
| |_^ PLR6201
7 |
8 | # OK
7 | fruits = ["cherry", "grapes"]
8 | "cherry" in fruits
|
= help: Convert to `set`
@@ -62,8 +62,29 @@ literal_membership.py:4:6: PLR6201 [*] Use a `set` literal when testing for memb
5 5 | 1, 2, 3
6 |-)
6 |+}
7 7 |
8 8 | # OK
9 9 | fruits = ["cherry", "grapes"]
7 7 | fruits = ["cherry", "grapes"]
8 8 | "cherry" in fruits
9 9 | _ = {key: value for key, value in {"a": 1, "b": 2}.items() if key in ("a", "b")}
literal_membership.py:9:70: PLR6201 [*] Use a `set` literal when testing for membership
|
7 | fruits = ["cherry", "grapes"]
8 | "cherry" in fruits
9 | _ = {key: value for key, value in {"a": 1, "b": 2}.items() if key in ("a", "b")}
| ^^^^^^^^^^ PLR6201
10 |
11 | # OK
|
= help: Convert to `set`
Unsafe fix
6 6 | )
7 7 | fruits = ["cherry", "grapes"]
8 8 | "cherry" in fruits
9 |-_ = {key: value for key, value in {"a": 1, "b": 2}.items() if key in ("a", "b")}
9 |+_ = {key: value for key, value in {"a": 1, "b": 2}.items() if key in {"a", "b"}}
10 10 |
11 11 | # OK
12 12 | fruits in [[1, 2, 3], [4, 5, 6]]

View File

@@ -1,31 +1,49 @@
---
source: crates/ruff_linter/src/rules/pylint/mod.rs
---
magic_value_comparison.py:59:22: PLR2004 Magic value used in comparison, consider replacing `"Hunter2"` with a constant variable
magic_value_comparison.py:56:12: PLR2004 Magic value used in comparison, consider replacing `-2.0` with a constant variable
|
54 | pass
55 |
56 | if argc != -2.0: # [magic-value-comparison]
| ^^^^ PLR2004
57 | pass
|
magic_value_comparison.py:59:12: PLR2004 Magic value used in comparison, consider replacing `+2.0` with a constant variable
|
57 | pass
58 |
59 | if input_password == "Hunter2": # correct
| ^^^^^^^^^ PLR2004
59 | if argc != +2.0: # [magic-value-comparison]
| ^^^^ PLR2004
60 | pass
|
magic_value_comparison.py:65:21: PLR2004 Magic value used in comparison, consider replacing `3.141592653589793238` with a constant variable
magic_value_comparison.py:74:22: PLR2004 Magic value used in comparison, consider replacing `"Hunter2"` with a constant variable
|
63 | pi_estimation = 3.14
64 |
65 | if pi_estimation == 3.141592653589793238: # [magic-value-comparison]
72 | pass
73 |
74 | if input_password == "Hunter2": # correct
| ^^^^^^^^^ PLR2004
75 | pass
|
magic_value_comparison.py:80:21: PLR2004 Magic value used in comparison, consider replacing `3.141592653589793238` with a constant variable
|
78 | pi_estimation = 3.14
79 |
80 | if pi_estimation == 3.141592653589793238: # [magic-value-comparison]
| ^^^^^^^^^^^^^^^^^^^^ PLR2004
66 | pass
81 | pass
|
magic_value_comparison.py:77:18: PLR2004 Magic value used in comparison, consider replacing `b"something"` with a constant variable
magic_value_comparison.py:92:18: PLR2004 Magic value used in comparison, consider replacing `b"something"` with a constant variable
|
75 | user_input = b"Hello, There!"
76 |
77 | if user_input == b"something": # correct
90 | user_input = b"Hello, There!"
91 |
92 | if user_input == b"something": # correct
| ^^^^^^^^^^^^ PLR2004
78 | pass
93 | pass
|

View File

@@ -17,6 +17,7 @@ mod tests {
#[test_case(Rule::ReadWholeFile, Path::new("FURB101.py"))]
#[test_case(Rule::RepeatedAppend, Path::new("FURB113.py"))]
#[test_case(Rule::ReimplementedOperator, Path::new("FURB118.py"))]
#[test_case(Rule::ReadlinesInFor, Path::new("FURB129.py"))]
#[test_case(Rule::DeleteFullSlice, Path::new("FURB131.py"))]
#[test_case(Rule::CheckAndRemoveFromSet, Path::new("FURB132.py"))]
#[test_case(Rule::IfExprMinMax, Path::new("FURB136.py"))]

View File

@@ -9,6 +9,7 @@ pub(crate) use math_constant::*;
pub(crate) use metaclass_abcmeta::*;
pub(crate) use print_empty_string::*;
pub(crate) use read_whole_file::*;
pub(crate) use readlines_in_for::*;
pub(crate) use redundant_log_base::*;
pub(crate) use regex_flag_alias::*;
pub(crate) use reimplemented_operator::*;
@@ -30,6 +31,7 @@ mod math_constant;
mod metaclass_abcmeta;
mod print_empty_string;
mod read_whole_file;
mod readlines_in_for;
mod redundant_log_base;
mod regex_flag_alias;
mod reimplemented_operator;

View File

@@ -0,0 +1,92 @@
use ruff_diagnostics::{AlwaysFixableViolation, Diagnostic, Edit, Fix};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{Comprehension, Expr, StmtFor};
use ruff_python_semantic::analyze::typing;
use ruff_python_semantic::analyze::typing::is_io_base_expr;
use ruff_text_size::Ranged;
use crate::checkers::ast::Checker;
/// ## What it does
/// Checks for uses of `readlines()` when iterating over a file line-by-line.
///
/// ## Why is this bad?
/// Rather than iterating over all lines in a file by calling `readlines()`,
/// it's more convenient and performant to iterate over the file object
/// directly.
///
/// ## Example
/// ```python
/// with open("file.txt") as fp:
/// for line in fp.readlines():
/// ...
/// ```
///
/// Use instead:
/// ```python
/// with open("file.txt") as fp:
/// for line in fp:
/// ...
/// ```
///
/// ## References
/// - [Python documentation: `io.IOBase.readlines`](https://docs.python.org/3/library/io.html#io.IOBase.readlines)
#[violation]
pub(crate) struct ReadlinesInFor;
impl AlwaysFixableViolation for ReadlinesInFor {
#[derive_message_formats]
fn message(&self) -> String {
format!("Instead of calling `readlines()`, iterate over file object directly")
}
fn fix_title(&self) -> String {
"Remove `readlines()`".into()
}
}
/// FURB129
pub(crate) fn readlines_in_for(checker: &mut Checker, for_stmt: &StmtFor) {
readlines_in_iter(checker, for_stmt.iter.as_ref());
}
/// FURB129
pub(crate) fn readlines_in_comprehension(checker: &mut Checker, comprehension: &Comprehension) {
readlines_in_iter(checker, &comprehension.iter);
}
fn readlines_in_iter(checker: &mut Checker, iter_expr: &Expr) {
let Expr::Call(expr_call) = iter_expr else {
return;
};
let Expr::Attribute(expr_attr) = expr_call.func.as_ref() else {
return;
};
if expr_attr.attr.as_str() != "readlines" || !expr_call.arguments.is_empty() {
return;
}
// Determine whether `fp` in `fp.readlines()` was bound to a file object.
if let Expr::Name(name) = expr_attr.value.as_ref() {
if !checker
.semantic()
.resolve_name(name)
.map(|id| checker.semantic().binding(id))
.is_some_and(|binding| typing::is_io_base(binding, checker.semantic()))
{
return;
}
} else {
if !is_io_base_expr(expr_attr.value.as_ref(), checker.semantic()) {
return;
}
}
let mut diagnostic = Diagnostic::new(ReadlinesInFor, expr_call.range());
diagnostic.set_fix(Fix::unsafe_edit(Edit::range_deletion(
expr_call.range().add_start(expr_attr.value.range().len()),
)));
checker.diagnostics.push(diagnostic);
}

View File

@@ -0,0 +1,207 @@
---
source: crates/ruff_linter/src/rules/refurb/mod.rs
---
FURB129.py:7:18: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
5 | # Errors
6 | with open("FURB129.py") as f:
7 | for _line in f.readlines():
| ^^^^^^^^^^^^^ FURB129
8 | pass
9 | a = [line.lower() for line in f.readlines()]
|
= help: Remove `readlines()`
Unsafe fix
4 4 |
5 5 | # Errors
6 6 | with open("FURB129.py") as f:
7 |- for _line in f.readlines():
7 |+ for _line in f:
8 8 | pass
9 9 | a = [line.lower() for line in f.readlines()]
10 10 | b = {line.upper() for line in f.readlines()}
FURB129.py:9:35: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
7 | for _line in f.readlines():
8 | pass
9 | a = [line.lower() for line in f.readlines()]
| ^^^^^^^^^^^^^ FURB129
10 | b = {line.upper() for line in f.readlines()}
11 | c = {line.lower(): line.upper() for line in f.readlines()}
|
= help: Remove `readlines()`
Unsafe fix
6 6 | with open("FURB129.py") as f:
7 7 | for _line in f.readlines():
8 8 | pass
9 |- a = [line.lower() for line in f.readlines()]
9 |+ a = [line.lower() for line in f]
10 10 | b = {line.upper() for line in f.readlines()}
11 11 | c = {line.lower(): line.upper() for line in f.readlines()}
12 12 |
FURB129.py:10:35: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
8 | pass
9 | a = [line.lower() for line in f.readlines()]
10 | b = {line.upper() for line in f.readlines()}
| ^^^^^^^^^^^^^ FURB129
11 | c = {line.lower(): line.upper() for line in f.readlines()}
|
= help: Remove `readlines()`
Unsafe fix
7 7 | for _line in f.readlines():
8 8 | pass
9 9 | a = [line.lower() for line in f.readlines()]
10 |- b = {line.upper() for line in f.readlines()}
10 |+ b = {line.upper() for line in f}
11 11 | c = {line.lower(): line.upper() for line in f.readlines()}
12 12 |
13 13 | with Path("FURB129.py").open() as f:
FURB129.py:11:49: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
9 | a = [line.lower() for line in f.readlines()]
10 | b = {line.upper() for line in f.readlines()}
11 | c = {line.lower(): line.upper() for line in f.readlines()}
| ^^^^^^^^^^^^^ FURB129
12 |
13 | with Path("FURB129.py").open() as f:
|
= help: Remove `readlines()`
Unsafe fix
8 8 | pass
9 9 | a = [line.lower() for line in f.readlines()]
10 10 | b = {line.upper() for line in f.readlines()}
11 |- c = {line.lower(): line.upper() for line in f.readlines()}
11 |+ c = {line.lower(): line.upper() for line in f}
12 12 |
13 13 | with Path("FURB129.py").open() as f:
14 14 | for _line in f.readlines():
FURB129.py:14:18: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
13 | with Path("FURB129.py").open() as f:
14 | for _line in f.readlines():
| ^^^^^^^^^^^^^ FURB129
15 | pass
|
= help: Remove `readlines()`
Unsafe fix
11 11 | c = {line.lower(): line.upper() for line in f.readlines()}
12 12 |
13 13 | with Path("FURB129.py").open() as f:
14 |- for _line in f.readlines():
14 |+ for _line in f:
15 15 | pass
16 16 |
17 17 | for _line in open("FURB129.py").readlines():
FURB129.py:17:14: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
15 | pass
16 |
17 | for _line in open("FURB129.py").readlines():
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ FURB129
18 | pass
|
= help: Remove `readlines()`
Unsafe fix
14 14 | for _line in f.readlines():
15 15 | pass
16 16 |
17 |-for _line in open("FURB129.py").readlines():
17 |+for _line in open("FURB129.py"):
18 18 | pass
19 19 |
20 20 | for _line in Path("FURB129.py").open().readlines():
FURB129.py:20:14: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
18 | pass
19 |
20 | for _line in Path("FURB129.py").open().readlines():
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ FURB129
21 | pass
|
= help: Remove `readlines()`
Unsafe fix
17 17 | for _line in open("FURB129.py").readlines():
18 18 | pass
19 19 |
20 |-for _line in Path("FURB129.py").open().readlines():
20 |+for _line in Path("FURB129.py").open():
21 21 | pass
22 22 |
23 23 |
FURB129.py:26:18: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
24 | def func():
25 | f = Path("FURB129.py").open()
26 | for _line in f.readlines():
| ^^^^^^^^^^^^^ FURB129
27 | pass
28 | f.close()
|
= help: Remove `readlines()`
Unsafe fix
23 23 |
24 24 | def func():
25 25 | f = Path("FURB129.py").open()
26 |- for _line in f.readlines():
26 |+ for _line in f:
27 27 | pass
28 28 | f.close()
29 29 |
FURB129.py:32:18: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
31 | def func(f: io.BytesIO):
32 | for _line in f.readlines():
| ^^^^^^^^^^^^^ FURB129
33 | pass
|
= help: Remove `readlines()`
Unsafe fix
29 29 |
30 30 |
31 31 | def func(f: io.BytesIO):
32 |- for _line in f.readlines():
32 |+ for _line in f:
33 33 | pass
34 34 |
35 35 |
FURB129.py:38:22: FURB129 [*] Instead of calling `readlines()`, iterate over file object directly
|
36 | def func():
37 | with (open("FURB129.py") as f, foo as bar):
38 | for _line in f.readlines():
| ^^^^^^^^^^^^^ FURB129
39 | pass
40 | for _line in bar.readlines():
|
= help: Remove `readlines()`
Unsafe fix
35 35 |
36 36 | def func():
37 37 | with (open("FURB129.py") as f, foo as bar):
38 |- for _line in f.readlines():
38 |+ for _line in f:
39 39 | pass
40 40 | for _line in bar.readlines():
41 41 | pass

View File

@@ -48,6 +48,7 @@ mod tests {
#[test_case(Rule::DefaultFactoryKwarg, Path::new("RUF026.py"))]
#[test_case(Rule::MissingFStringSyntax, Path::new("RUF027_0.py"))]
#[test_case(Rule::MissingFStringSyntax, Path::new("RUF027_1.py"))]
#[test_case(Rule::MissingFStringSyntax, Path::new("RUF027_2.py"))]
fn rules(rule_code: Rule, path: &Path) -> Result<()> {
let snapshot = format!("{}_{}", rule_code.noqa_code(), path.to_string_lossy());
let diagnostics = test_path(

View File

@@ -4,9 +4,11 @@ use bitflags::bitflags;
use ruff_diagnostics::{Diagnostic, DiagnosticKind, Violation};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::StringLike;
use ruff_source_file::Locator;
use ruff_text_size::{TextLen, TextRange, TextSize};
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};
use crate::checkers::ast::Checker;
use crate::registry::AsRule;
use crate::rules::ruff::rules::confusables::confusable;
use crate::rules::ruff::rules::Context;
@@ -171,16 +173,59 @@ impl Violation for AmbiguousUnicodeCharacterComment {
}
}
/// RUF001, RUF002, RUF003
pub(crate) fn ambiguous_unicode_character(
/// RUF003
pub(crate) fn ambiguous_unicode_character_comment(
diagnostics: &mut Vec<Diagnostic>,
locator: &Locator,
range: TextRange,
settings: &LinterSettings,
) {
let text = locator.slice(range);
ambiguous_unicode_character(diagnostics, text, range, Context::Comment, settings);
}
/// RUF001, RUF002
pub(crate) fn ambiguous_unicode_character_string(checker: &mut Checker, string_like: StringLike) {
let context = if checker.semantic().in_docstring() {
Context::Docstring
} else {
Context::String
};
match string_like {
StringLike::StringLiteral(string_literal) => {
for string in &string_literal.value {
let text = checker.locator().slice(string);
ambiguous_unicode_character(
&mut checker.diagnostics,
text,
string.range(),
context,
checker.settings,
);
}
}
StringLike::FStringLiteral(f_string_literal) => {
let text = checker.locator().slice(f_string_literal);
ambiguous_unicode_character(
&mut checker.diagnostics,
text,
f_string_literal.range(),
context,
checker.settings,
);
}
StringLike::BytesLiteral(_) => (),
}
}
fn ambiguous_unicode_character(
diagnostics: &mut Vec<Diagnostic>,
text: &str,
range: TextRange,
context: Context,
settings: &LinterSettings,
) {
let text = locator.slice(range);
// Most of the time, we don't need to check for ambiguous unicode characters at all.
if text.is_ascii() {
return;

View File

@@ -52,14 +52,15 @@ use ruff_text_size::Ranged;
/// - [The Python Standard Library](https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task)
#[violation]
pub struct AsyncioDanglingTask {
expr: String,
method: Method,
}
impl Violation for AsyncioDanglingTask {
#[derive_message_formats]
fn message(&self) -> String {
let AsyncioDanglingTask { method } = self;
format!("Store a reference to the return value of `asyncio.{method}`")
let AsyncioDanglingTask { expr, method } = self;
format!("Store a reference to the return value of `{expr}.{method}`")
}
}
@@ -80,23 +81,35 @@ pub(crate) fn asyncio_dangling_task(expr: &Expr, semantic: &SemanticModel) -> Op
})
{
return Some(Diagnostic::new(
AsyncioDanglingTask { method },
AsyncioDanglingTask {
expr: "asyncio".to_string(),
method,
},
expr.range(),
));
}
// Ex) `loop = asyncio.get_running_loop(); loop.create_task(...)`
// Ex) `loop = ...; loop.create_task(...)`
if let Expr::Attribute(ast::ExprAttribute { attr, value, .. }) = func.as_ref() {
if attr == "create_task" {
if typing::resolve_assignment(value, semantic).is_some_and(|call_path| {
matches!(call_path.as_slice(), ["asyncio", "get_running_loop"])
}) {
return Some(Diagnostic::new(
AsyncioDanglingTask {
method: Method::CreateTask,
},
expr.range(),
));
if let Expr::Name(name) = value.as_ref() {
if typing::resolve_assignment(value, semantic).is_some_and(|call_path| {
matches!(
call_path.as_slice(),
[
"asyncio",
"get_event_loop" | "get_running_loop" | "new_event_loop"
]
)
}) {
return Some(Diagnostic::new(
AsyncioDanglingTask {
expr: name.id.to_string(),
method: Method::CreateTask,
},
expr.range(),
));
}
}
}
}

View File

@@ -88,8 +88,10 @@ fn should_be_fstring(
return false;
}
let Ok(ast::Expr::FString(ast::ExprFString { value, .. })) =
parse_expression(&format!("f{}", locator.slice(literal.range())))
let fstring_expr = format!("f{}", locator.slice(literal));
// Note: Range offsets for `value` are based on `fstring_expr`
let Ok(ast::Expr::FString(ast::ExprFString { value, .. })) = parse_expression(&fstring_expr)
else {
return false;
};
@@ -159,7 +161,7 @@ fn should_be_fstring(
has_name = true;
}
if let Some(spec) = &element.format_spec {
let spec = locator.slice(spec.range());
let spec = &fstring_expr[spec.range()];
if FormatSpec::parse(spec).is_err() {
return false;
}

View File

@@ -25,7 +25,7 @@ RUF006.py:68:12: RUF006 Store a reference to the return value of `asyncio.create
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RUF006
|
RUF006.py:74:26: RUF006 Store a reference to the return value of `asyncio.create_task`
RUF006.py:74:26: RUF006 Store a reference to the return value of `loop.create_task`
|
72 | def f():
73 | loop = asyncio.get_running_loop()
@@ -33,7 +33,7 @@ RUF006.py:74:26: RUF006 Store a reference to the return value of `asyncio.create
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RUF006
|
RUF006.py:97:5: RUF006 Store a reference to the return value of `asyncio.create_task`
RUF006.py:97:5: RUF006 Store a reference to the return value of `loop.create_task`
|
95 | def f():
96 | loop = asyncio.get_running_loop()
@@ -41,4 +41,24 @@ RUF006.py:97:5: RUF006 Store a reference to the return value of `asyncio.create_
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RUF006
|
RUF006.py:170:5: RUF006 Store a reference to the return value of `loop.create_task`
|
168 | def f():
169 | loop = asyncio.new_event_loop()
170 | loop.create_task(main()) # Error
| ^^^^^^^^^^^^^^^^^^^^^^^^ RUF006
171 |
172 | # Error
|
RUF006.py:175:5: RUF006 Store a reference to the return value of `loop.create_task`
|
173 | def f():
174 | loop = asyncio.get_event_loop()
175 | loop.create_task(main()) # Error
| ^^^^^^^^^^^^^^^^^^^^^^^^ RUF006
176 |
177 | # OK
|

View File

@@ -285,6 +285,8 @@ RUF027_0.py:70:18: RUF027 [*] Possible f-string without an `f` prefix
69 | last = "Appleseed"
70 | value.method("{first} {last}") # RUF027
| ^^^^^^^^^^^^^^^^ RUF027
71 |
72 | def format_specifiers():
|
= help: Add `f` prefix
@@ -294,5 +296,24 @@ RUF027_0.py:70:18: RUF027 [*] Possible f-string without an `f` prefix
69 69 | last = "Appleseed"
70 |- value.method("{first} {last}") # RUF027
70 |+ value.method(f"{first} {last}") # RUF027
71 71 |
72 72 | def format_specifiers():
73 73 | a = 4
RUF027_0.py:74:9: RUF027 [*] Possible f-string without an `f` prefix
|
72 | def format_specifiers():
73 | a = 4
74 | b = "{a:b} {a:^5}"
| ^^^^^^^^^^^^^^ RUF027
|
= help: Add `f` prefix
Unsafe fix
71 71 |
72 72 | def format_specifiers():
73 73 | a = 4
74 |- b = "{a:b} {a:^5}"
74 |+ b = f"{a:b} {a:^5}"

View File

@@ -0,0 +1,4 @@
---
source: crates/ruff_linter/src/rules/ruff/mod.rs
---

View File

@@ -92,7 +92,9 @@ pub fn derive_message_formats(_attr: TokenStream, item: TokenStream) -> TokenStr
///
/// Good:
///
/// ```rust
/// ```ignroe
/// use ruff_macros::newtype_index;
///
/// #[newtype_index]
/// #[derive(Ord, PartialOrd)]
/// struct MyIndex;

View File

@@ -11,6 +11,7 @@ repository = { workspace = true }
license = { workspace = true }
[lib]
doctest = false
[dependencies]
ruff_diagnostics = { path = "../ruff_diagnostics" }

View File

@@ -935,7 +935,7 @@ where
}
}
/// A [`StatementVisitor`] that collects all `return` statements in a function or method.
/// A [`Visitor`] that collects all `return` statements in a function or method.
#[derive(Default)]
pub struct ReturnStatementVisitor<'a> {
pub returns: Vec<&'a ast::StmtReturn>,

View File

@@ -1011,7 +1011,7 @@ pub struct DebugText {
#[derive(Clone, Debug, PartialEq)]
pub struct ExprFString {
pub range: TextRange,
pub value: FStringValue,
inner: ExprFStringInner,
}
impl From<ExprFString> for Expr {
@@ -1020,17 +1020,12 @@ impl From<ExprFString> for Expr {
}
}
/// The value representing an [`ExprFString`].
#[derive(Clone, Debug, PartialEq)]
pub struct FStringValue {
inner: FStringValueInner,
}
impl FStringValue {
impl ExprFString {
/// Creates a new f-string with the given value.
pub fn single(value: FString) -> Self {
Self {
inner: FStringValueInner::Single(FStringPart::FString(value)),
range: value.range,
inner: ExprFStringInner::Single(FStringPart::FString(value)),
}
}
@@ -1040,31 +1035,32 @@ impl FStringValue {
/// # Panics
///
/// Panics if `values` is less than 2. Use [`FStringValue::single`] instead.
pub fn concatenated(values: Vec<FStringPart>) -> Self {
pub fn concatenated(values: Vec<FStringPart>, range: TextRange) -> Self {
assert!(values.len() > 1);
Self {
inner: FStringValueInner::Concatenated(values),
range,
inner: ExprFStringInner::Concatenated(values),
}
}
/// Returns `true` if the f-string is implicitly concatenated, `false` otherwise.
pub fn is_implicit_concatenated(&self) -> bool {
matches!(self.inner, FStringValueInner::Concatenated(_))
matches!(self.inner, ExprFStringInner::Concatenated(_))
}
/// Returns a slice of all the [`FStringPart`]s contained in this value.
pub fn as_slice(&self) -> &[FStringPart] {
match &self.inner {
FStringValueInner::Single(part) => std::slice::from_ref(part),
FStringValueInner::Concatenated(parts) => parts,
ExprFStringInner::Single(part) => std::slice::from_ref(part),
ExprFStringInner::Concatenated(parts) => parts,
}
}
/// Returns a mutable slice of all the [`FStringPart`]s contained in this value.
fn as_mut_slice(&mut self) -> &mut [FStringPart] {
match &mut self.inner {
FStringValueInner::Single(part) => std::slice::from_mut(part),
FStringValueInner::Concatenated(parts) => parts,
ExprFStringInner::Single(part) => std::slice::from_mut(part),
ExprFStringInner::Concatenated(parts) => parts,
}
}
@@ -1121,7 +1117,7 @@ impl FStringValue {
}
}
impl<'a> IntoIterator for &'a FStringValue {
impl<'a> IntoIterator for &'a ExprFString {
type Item = &'a FStringPart;
type IntoIter = Iter<'a, FStringPart>;
@@ -1130,9 +1126,10 @@ impl<'a> IntoIterator for &'a FStringValue {
}
}
impl<'a> IntoIterator for &'a mut FStringValue {
impl<'a> IntoIterator for &'a mut ExprFString {
type Item = &'a mut FStringPart;
type IntoIter = IterMut<'a, FStringPart>;
fn into_iter(self) -> Self::IntoIter {
self.iter_mut()
}
@@ -1140,7 +1137,7 @@ impl<'a> IntoIterator for &'a mut FStringValue {
/// An internal representation of [`FStringValue`].
#[derive(Clone, Debug, PartialEq)]
enum FStringValueInner {
enum ExprFStringInner {
/// A single f-string i.e., `f"foo"`.
///
/// This is always going to be `FStringPart::FString` variant which is
@@ -1182,11 +1179,7 @@ impl Ranged for FString {
impl From<FString> for Expr {
fn from(payload: FString) -> Self {
ExprFString {
range: payload.range,
value: FStringValue::single(payload),
}
.into()
Expr::FString(ExprFString::single(payload))
}
}
@@ -1210,7 +1203,7 @@ impl Ranged for FStringElement {
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ExprStringLiteral {
pub range: TextRange,
pub value: StringLiteralValue,
inner: ExprStringLiteralInner,
}
impl From<ExprStringLiteral> for Expr {
@@ -1225,17 +1218,12 @@ impl Ranged for ExprStringLiteral {
}
}
/// The value representing a [`ExprStringLiteral`].
#[derive(Clone, Debug, Default, PartialEq)]
pub struct StringLiteralValue {
inner: StringLiteralValueInner,
}
impl StringLiteralValue {
impl ExprStringLiteral {
/// Creates a new single string literal with the given value.
pub fn single(string: StringLiteral) -> Self {
Self {
inner: StringLiteralValueInner::Single(string),
range: string.range,
inner: ExprStringLiteralInner::Single(string),
}
}
@@ -1246,10 +1234,11 @@ impl StringLiteralValue {
///
/// Panics if `strings` is less than 2. Use [`StringLiteralValue::single`]
/// instead.
pub fn concatenated(strings: Vec<StringLiteral>) -> Self {
pub fn concatenated(strings: Vec<StringLiteral>, range: TextRange) -> Self {
assert!(strings.len() > 1);
Self {
inner: StringLiteralValueInner::Concatenated(ConcatenatedStringLiteral {
range,
inner: ExprStringLiteralInner::Concatenated(ConcatenatedStringLiteral {
strings,
value: OnceCell::new(),
}),
@@ -1258,7 +1247,7 @@ impl StringLiteralValue {
/// Returns `true` if the string literal is implicitly concatenated.
pub const fn is_implicit_concatenated(&self) -> bool {
matches!(self.inner, StringLiteralValueInner::Concatenated(_))
matches!(self.inner, ExprStringLiteralInner::Concatenated(_))
}
/// Returns `true` if the string literal is a unicode string.
@@ -1272,16 +1261,16 @@ impl StringLiteralValue {
/// Returns a slice of all the [`StringLiteral`] parts contained in this value.
pub fn as_slice(&self) -> &[StringLiteral] {
match &self.inner {
StringLiteralValueInner::Single(value) => std::slice::from_ref(value),
StringLiteralValueInner::Concatenated(value) => value.strings.as_slice(),
ExprStringLiteralInner::Single(value) => std::slice::from_ref(value),
ExprStringLiteralInner::Concatenated(value) => value.strings.as_slice(),
}
}
/// Returns a mutable slice of all the [`StringLiteral`] parts contained in this value.
fn as_mut_slice(&mut self) -> &mut [StringLiteral] {
match &mut self.inner {
StringLiteralValueInner::Single(value) => std::slice::from_mut(value),
StringLiteralValueInner::Concatenated(value) => value.strings.as_mut_slice(),
ExprStringLiteralInner::Single(value) => std::slice::from_mut(value),
ExprStringLiteralInner::Concatenated(value) => value.strings.as_mut_slice(),
}
}
@@ -1318,13 +1307,13 @@ impl StringLiteralValue {
/// string value is implicitly concatenated.
pub fn to_str(&self) -> &str {
match &self.inner {
StringLiteralValueInner::Single(value) => value.as_str(),
StringLiteralValueInner::Concatenated(value) => value.to_str(),
ExprStringLiteralInner::Single(value) => value.as_str(),
ExprStringLiteralInner::Concatenated(value) => value.to_str(),
}
}
}
impl<'a> IntoIterator for &'a StringLiteralValue {
impl<'a> IntoIterator for &'a ExprStringLiteral {
type Item = &'a StringLiteral;
type IntoIter = Iter<'a, StringLiteral>;
@@ -1333,15 +1322,16 @@ impl<'a> IntoIterator for &'a StringLiteralValue {
}
}
impl<'a> IntoIterator for &'a mut StringLiteralValue {
impl<'a> IntoIterator for &'a mut ExprStringLiteral {
type Item = &'a mut StringLiteral;
type IntoIter = IterMut<'a, StringLiteral>;
fn into_iter(self) -> Self::IntoIter {
self.iter_mut()
}
}
impl PartialEq<str> for StringLiteralValue {
impl PartialEq<str> for ExprStringLiteral {
fn eq(&self, other: &str) -> bool {
if self.len() != other.len() {
return false;
@@ -1351,13 +1341,13 @@ impl PartialEq<str> for StringLiteralValue {
}
}
impl PartialEq<String> for StringLiteralValue {
impl PartialEq<String> for ExprStringLiteral {
fn eq(&self, other: &String) -> bool {
self == other.as_str()
}
}
impl fmt::Display for StringLiteralValue {
impl fmt::Display for ExprStringLiteral {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.write_str(self.to_str())
}
@@ -1365,7 +1355,7 @@ impl fmt::Display for StringLiteralValue {
/// An internal representation of [`StringLiteralValue`].
#[derive(Clone, Debug, PartialEq)]
enum StringLiteralValueInner {
enum ExprStringLiteralInner {
/// A single string literal i.e., `"foo"`.
Single(StringLiteral),
@@ -1373,7 +1363,7 @@ enum StringLiteralValueInner {
Concatenated(ConcatenatedStringLiteral),
}
impl Default for StringLiteralValueInner {
impl Default for ExprStringLiteralInner {
fn default() -> Self {
Self::Single(StringLiteral::default())
}
@@ -1411,11 +1401,7 @@ impl StringLiteral {
impl From<StringLiteral> for Expr {
fn from(payload: StringLiteral) -> Self {
ExprStringLiteral {
range: payload.range,
value: StringLiteralValue::single(payload),
}
.into()
Expr::StringLiteral(ExprStringLiteral::single(payload))
}
}
@@ -1466,7 +1452,7 @@ impl Debug for ConcatenatedStringLiteral {
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ExprBytesLiteral {
pub range: TextRange,
pub value: BytesLiteralValue,
inner: ExprBytesLiteralInner,
}
impl From<ExprBytesLiteral> for Expr {
@@ -1481,17 +1467,12 @@ impl Ranged for ExprBytesLiteral {
}
}
/// The value representing a [`ExprBytesLiteral`].
#[derive(Clone, Debug, Default, PartialEq)]
pub struct BytesLiteralValue {
inner: BytesLiteralValueInner,
}
impl BytesLiteralValue {
impl ExprBytesLiteral {
/// Creates a new single bytes literal with the given value.
pub fn single(value: BytesLiteral) -> Self {
Self {
inner: BytesLiteralValueInner::Single(value),
range: value.range,
inner: ExprBytesLiteralInner::Single(value),
}
}
@@ -1502,31 +1483,32 @@ impl BytesLiteralValue {
///
/// Panics if `values` is less than 2. Use [`BytesLiteralValue::single`]
/// instead.
pub fn concatenated(values: Vec<BytesLiteral>) -> Self {
pub fn concatenated(values: Vec<BytesLiteral>, range: TextRange) -> Self {
assert!(values.len() > 1);
Self {
inner: BytesLiteralValueInner::Concatenated(values),
range,
inner: ExprBytesLiteralInner::Concatenated(values),
}
}
/// Returns `true` if the bytes literal is implicitly concatenated.
pub const fn is_implicit_concatenated(&self) -> bool {
matches!(self.inner, BytesLiteralValueInner::Concatenated(_))
matches!(self.inner, ExprBytesLiteralInner::Concatenated(_))
}
/// Returns a slice of all the [`BytesLiteral`] parts contained in this value.
pub fn as_slice(&self) -> &[BytesLiteral] {
match &self.inner {
BytesLiteralValueInner::Single(value) => std::slice::from_ref(value),
BytesLiteralValueInner::Concatenated(value) => value.as_slice(),
ExprBytesLiteralInner::Single(value) => std::slice::from_ref(value),
ExprBytesLiteralInner::Concatenated(value) => value.as_slice(),
}
}
/// Returns a mutable slice of all the [`BytesLiteral`] parts contained in this value.
fn as_mut_slice(&mut self) -> &mut [BytesLiteral] {
match &mut self.inner {
BytesLiteralValueInner::Single(value) => std::slice::from_mut(value),
BytesLiteralValueInner::Concatenated(value) => value.as_mut_slice(),
ExprBytesLiteralInner::Single(value) => std::slice::from_mut(value),
ExprBytesLiteralInner::Concatenated(value) => value.as_mut_slice(),
}
}
@@ -1557,7 +1539,7 @@ impl BytesLiteralValue {
}
}
impl<'a> IntoIterator for &'a BytesLiteralValue {
impl<'a> IntoIterator for &'a ExprBytesLiteral {
type Item = &'a BytesLiteral;
type IntoIter = Iter<'a, BytesLiteral>;
@@ -1566,15 +1548,16 @@ impl<'a> IntoIterator for &'a BytesLiteralValue {
}
}
impl<'a> IntoIterator for &'a mut BytesLiteralValue {
impl<'a> IntoIterator for &'a mut ExprBytesLiteral {
type Item = &'a mut BytesLiteral;
type IntoIter = IterMut<'a, BytesLiteral>;
fn into_iter(self) -> Self::IntoIter {
self.iter_mut()
}
}
impl PartialEq<[u8]> for BytesLiteralValue {
impl PartialEq<[u8]> for ExprBytesLiteral {
fn eq(&self, other: &[u8]) -> bool {
if self.len() != other.len() {
return false;
@@ -1588,7 +1571,7 @@ impl PartialEq<[u8]> for BytesLiteralValue {
/// An internal representation of [`BytesLiteralValue`].
#[derive(Clone, Debug, PartialEq)]
enum BytesLiteralValueInner {
enum ExprBytesLiteralInner {
/// A single bytes literal i.e., `b"foo"`.
Single(BytesLiteral),
@@ -1596,7 +1579,7 @@ enum BytesLiteralValueInner {
Concatenated(Vec<BytesLiteral>),
}
impl Default for BytesLiteralValueInner {
impl Default for ExprBytesLiteralInner {
fn default() -> Self {
Self::Single(BytesLiteral::default())
}
@@ -1633,11 +1616,7 @@ impl BytesLiteral {
impl From<BytesLiteral> for Expr {
fn from(payload: BytesLiteral) -> Self {
ExprBytesLiteral {
range: payload.range,
value: BytesLiteralValue::single(payload),
}
.into()
Expr::BytesLiteral(ExprBytesLiteral::single(payload))
}
}

View File

@@ -11,6 +11,7 @@ repository = { workspace = true }
license = { workspace = true }
[lib]
doctest = false
[dependencies]
ruff_python_ast = { path = "../ruff_python_ast" }

View File

@@ -1705,6 +1705,7 @@ class Foo:
assert_round_trip!(r#"f"{ chr(65) = !s}""#);
assert_round_trip!(r#"f"{ chr(65) = !r}""#);
assert_round_trip!(r#"f"{ chr(65) = :#x}""#);
assert_round_trip!(r#"f"{ ( chr(65) ) = }""#);
assert_round_trip!(r#"f"{a=!r:0.05f}""#);
}

View File

@@ -10,6 +10,9 @@ documentation = { workspace = true }
repository = { workspace = true }
license = { workspace = true }
[lib]
doctest= false
[dependencies]
ruff_cache = { path = "../ruff_cache" }
ruff_formatter = { path = "../ruff_formatter" }

View File

@@ -4,4 +4,8 @@ ij_formatter_enabled = false
["range_formatting/*.py"]
generated_code = true
ij_formatter_enabled = false
[docstring_tab_indentation.py]
generated_code = true
ij_formatter_enabled = false

View File

@@ -0,0 +1,10 @@
[
{
"indent_style": "tab",
"indent_width": 4
},
{
"indent_style": "tab",
"indent_width": 8
}
]

View File

@@ -0,0 +1,72 @@
# Tests the behavior of the formatter when it comes to tabs inside docstrings
# when using `indent_style="tab`
# The example below uses tabs exclusively. The formatter should preserve the tab indentation
# of `arg1`.
def tab_argument(arg1: str) -> None:
"""
Arguments:
arg1: super duper arg with 2 tabs in front
"""
# The `arg1` is intended with spaces. The formatter should not change the spaces to a tab
# because it must assume that the spaces are used for alignment and not indentation.
def space_argument(arg1: str) -> None:
"""
Arguments:
arg1: super duper arg with a tab and a space in front
"""
def under_indented(arg1: str) -> None:
"""
Arguments:
arg1: super duper arg with a tab and a space in front
arg2: Not properly indented
"""
def under_indented_tabs(arg1: str) -> None:
"""
Arguments:
arg1: super duper arg with a tab and a space in front
arg2: Not properly indented
"""
def spaces_tabs_over_indent(arg1: str) -> None:
"""
Arguments:
arg1: super duper arg with a tab and a space in front
"""
# The docstring itself is indented with spaces but the argument is indented by a tab.
# Keep the tab indentation of the argument, convert th docstring indent to tabs.
def space_indented_docstring_containing_tabs(arg1: str) -> None:
"""
Arguments:
arg1: super duper arg
"""
# The docstring uses tabs, spaces, tabs indentation.
# Fallback to use space indentation
def mixed_indentation(arg1: str) -> None:
"""
Arguments:
arg1: super duper arg with a tab and a space in front
"""
# The example shows an ascii art. The formatter should not change the spaces
# to tabs because it breaks the ASCII art when inspecting the docstring with `inspect.cleandoc(ascii_art.__doc__)`
# when using an indent width other than 8.
def ascii_art():
r"""
Look at this beautiful tree.
a
/ \
b c
/ \
d e
"""

View File

@@ -0,0 +1,8 @@
[
{
"preview": "enabled"
},
{
"preview": "disabled"
}
]

View File

@@ -30,22 +30,22 @@ result_f = (
# an expression inside a formatted value
(
f'{1}'
# comment
# comment 1
''
)
(
f'{1}' # comment
f'{1}' # comment 2
f'{2}'
)
(
f'{1}'
f'{2}' # comment
f'{2}' # comment 3
)
(
1, ( # comment
1, ( # comment 4
f'{2}'
)
)
@@ -53,7 +53,7 @@ result_f = (
(
(
f'{1}'
# comment
# comment 5
),
2
)
@@ -62,3 +62,221 @@ result_f = (
x = f'''a{""}b'''
y = f'''c{1}d"""e'''
z = f'''a{""}b''' f'''c{1}d"""e'''
# F-String formatting test cases (Preview)
# Simple expression with a mix of debug expression and comments.
x = f"{a}"
x = f"{
a = }"
x = f"{ # comment 6
a }"
x = f"{ # comment 7
a = }"
# Remove the parentheses as adding them doesn't make then fit within the line length limit.
# This is similar to how we format it before f-string formatting.
aaaaaaaaaaa = (
f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd } cccccccccc"
)
# Here, we would use the best fit layout to put the f-string indented on the next line
# similar to the next example.
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
aaaaaaaaaaa = (
f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
)
# This should never add the optional parentheses because even after adding them, the
# f-string exceeds the line length limit.
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { "bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { "bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" = } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { # comment 8
"bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" } ccccccccccccccc"
x = f"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa { # comment 9
"bbbbbbbbbbbbbbbbbbbbbbbbbbbbb" = } ccccccccccccccc"
# Multiple larger expressions which exceeds the line length limit. Here, we need to decide
# whether to split at the first or second expression. This should work similarly to the
# assignment statement formatting where we split from right to left in preview mode.
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
# The above example won't split but when we start introducing line breaks:
x = f"aaaaaaaaaaaa {
bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb
} cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc {
ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb } cccccccccccccccccccc { ddddddddddddddd
} eeeeeeeeeeeeee"
# But, in case comments are present, we would split at the expression containing the
# comments:
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb # comment 10
} cccccccccccccccccccc { ddddddddddddddd } eeeeeeeeeeeeee"
x = f"aaaaaaaaaaaa { bbbbbbbbbbbbbb
} cccccccccccccccccccc { # comment 11
ddddddddddddddd } eeeeeeeeeeeeee"
# Here, the expression part itself starts with a curly brace so we need to add an extra
# space between the opening curly brace and the expression.
x = f"{ {'x': 1, 'y': 2} }"
# Although the extra space isn't required before the ending curly brace, we add it for
# consistency.
x = f"{ {'x': 1, 'y': 2}}"
x = f"{ {'x': 1, 'y': 2} = }"
x = f"{ # comment 12
{'x': 1, 'y': 2} }"
x = f"{ # comment 13
{'x': 1, 'y': 2} = }"
# But, in this case, we would split the expression itself because it exceeds the line
# length limit so we need not add the extra space.
xxxxxxx = f"{
{'aaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbb', 'ccccccccccccccccccccc'}
}"
# And, split the expression itself because it exceeds the line length.
xxxxxxx = f"{
{'aaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccccccccccccccccc'}
}"
# Quotes
f"foo 'bar' {x}"
f"foo \"bar\" {x}"
f'foo "bar" {x}'
f'foo \'bar\' {x}'
f"foo {"bar"}"
f"foo {'\'bar\''}"
# Here, the formatter will remove the escapes which is correct because they aren't allowed
# pre 3.12. This means we can assume that the f-string is used in the context of 3.12.
f"foo {'\"bar\"'}"
# Triple-quoted strings
# It's ok to use the same quote char for the inner string if it's single-quoted.
f"""test {'inner'}"""
f"""test {"inner"}"""
# But if the inner string is also triple-quoted then we should preserve the existing quotes.
f"""test {'''inner'''}"""
# Magic trailing comma
#
# The expression formatting will result in breaking it across multiple lines with a
# trailing comma but as the expression isn't already broken, we will remove all the line
# breaks which results in the trailing comma being present. This test case makes sure
# that the trailing comma is removed as well.
f"aaaaaaa {['aaaaaaaaaaaaaaa', 'bbbbbbbbbbbbb', 'ccccccccccccccccc', 'ddddddddddddddd', 'eeeeeeeeeeeeee']} aaaaaaa"
# And, if the trailing comma is already present, we still need to remove it.
f"aaaaaaa {['aaaaaaaaaaaaaaa', 'bbbbbbbbbbbbb', 'ccccccccccccccccc', 'ddddddddddddddd', 'eeeeeeeeeeeeee',]} aaaaaaa"
# Keep this Multiline by breaking it at the square brackets.
f"""aaaaaa {[
xxxxxxxx,
yyyyyyyy,
]} ccc"""
# Add the magic trailing comma because the elements don't fit within the line length limit
# when collapsed.
f"aaaaaa {[
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
xxxxxxxxxxxx,
yyyyyyyyyyyy
]} ccccccc"
# Remove the parenthese because they aren't required
xxxxxxxxxxxxxxx = (
f"aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbb {
xxxxxxxxxxx # comment 14
+ yyyyyyyyyy
} dddddddddd"
)
# Comments
# No comments should be dropped!
f"{ # comment 15
# comment 16
foo # comment 17
# comment 18
}" # comment 19
# comment 20
# Conversion flags
#
# This is not a valid Python code because of the additional whitespace between the `!`
# and conversion type. But, our parser isn't strict about this. This should probably be
# removed once we have a strict parser.
x = f"aaaaaaaaa { x ! r }"
# Even in the case of debug expresions, we only need to preserve the whitespace within
# the expression part of the replacement field.
x = f"aaaaaaaaa { x = ! r }"
# Combine conversion flags with format specifiers
x = f"{x = ! s
:>0
}"
# This is interesting. There can be a comment after the format specifier but only if it's
# on it's own line. Refer to https://github.com/astral-sh/ruff/pull/7787 for more details.
# We'll format is as trailing comments.
x = f"{x !s
:>0
# comment 21
}"
x = f"""
{ # comment 22
x = :.0{y # comment 23
}f}"""
# Here, the debug expression is in a nested f-string so we should start preserving
# whitespaces from that point onwards. This means we should format the outer f-string.
x = f"""{"foo " + # comment 24
f"{ x =
}" # comment 25
}
"""
# Mix of various features.
f"{ # comment 26
foo # after foo
:>{
x # after x
}
# comment 27
# comment 28
} woah {x}"
# Indentation
# What should be the indentation?
# https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590
if indent0:
if indent1:
if indent2:
foo = f"""hello world
hello {
f"aaaaaaa {
[
'aaaaaaaaaaaaaaaaaaaaa',
'bbbbbbbbbbbbbbbbbbbbb',
'ccccccccccccccccccccc',
'ddddddddddddddddddddd'
]
} bbbbbbbb" +
[
'aaaaaaaaaaaaaaaaaaaaa',
'bbbbbbbbbbbbbbbbbbbbb',
'ccccccccccccccccccccc',
'ddddddddddddddddddddd'
]
} --------
"""

View File

@@ -0,0 +1,5 @@
[
{
"target_version": "py312"
}
]

View File

@@ -0,0 +1,6 @@
# This file contains test cases only for cases where the logic tests for whether
# the target version is 3.12 or later. A user can have 3.12 syntax even if the target
# version isn't set.
# Quotes re-use
f"{'a'}"

View File

@@ -0,0 +1,8 @@
[
{
"source_type": "Ipynb"
},
{
"source_type": "Python"
}
]

View File

@@ -0,0 +1,6 @@
"""
This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
Ruff should leave it as is
""";
"another normal string"

View File

@@ -1,7 +1,7 @@
use ruff_formatter::{write, Argument, Arguments};
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::context::{NodeLevel, WithNodeLevel};
use crate::context::{FStringState, NodeLevel, WithNodeLevel};
use crate::other::commas::has_magic_trailing_comma;
use crate::prelude::*;
@@ -206,6 +206,16 @@ impl<'fmt, 'ast, 'buf> JoinCommaSeparatedBuilder<'fmt, 'ast, 'buf> {
pub(crate) fn finish(&mut self) -> FormatResult<()> {
self.result.and_then(|()| {
// If the formatter is inside an f-string expression element, and the layout
// is flat, then we don't need to add a trailing comma.
if let FStringState::InsideExpressionElement(context) =
self.fmt.context().f_string_state()
{
if context.layout().is_flat() {
return Ok(());
}
}
if let Some(last_end) = self.entries.position() {
let magic_trailing_comma = has_magic_trailing_comma(
TextRange::new(last_end, self.sequence_end),

View File

@@ -289,6 +289,28 @@ fn handle_enclosed_comment<'a>(
}
}
AnyNodeRef::FString(fstring) => CommentPlacement::dangling(fstring, comment),
AnyNodeRef::FStringExpressionElement(_) => {
// Handle comments after the format specifier (should be rare):
//
// ```python
// f"literal {
// expr:.3f
// # comment
// }"
// ```
//
// This is a valid comment placement.
if matches!(
comment.preceding_node(),
Some(
AnyNodeRef::FStringExpressionElement(_) | AnyNodeRef::FStringLiteralElement(_)
)
) {
CommentPlacement::trailing(comment.enclosing_node(), comment)
} else {
handle_bracketed_end_of_line_comment(comment, locator)
}
}
AnyNodeRef::ExprList(_)
| AnyNodeRef::ExprSet(_)
| AnyNodeRef::ExprListComp(_)

View File

@@ -1,4 +1,5 @@
use crate::comments::Comments;
use crate::other::f_string::FStringContext;
use crate::string::QuoteChar;
use crate::PyFormatOptions;
use ruff_formatter::{Buffer, FormatContext, GroupId, IndentWidth, SourceCode};
@@ -22,6 +23,8 @@ pub struct PyFormatContext<'a> {
/// quote style that is inverted from the one here in order to ensure that
/// the formatted Python code will be valid.
docstring: Option<QuoteChar>,
/// The state of the formatter with respect to f-strings.
f_string_state: FStringState,
}
impl<'a> PyFormatContext<'a> {
@@ -33,6 +36,7 @@ impl<'a> PyFormatContext<'a> {
node_level: NodeLevel::TopLevel(TopLevelStatementPosition::Other),
indent_level: IndentLevel::new(0),
docstring: None,
f_string_state: FStringState::Outside,
}
}
@@ -86,6 +90,14 @@ impl<'a> PyFormatContext<'a> {
}
}
pub(crate) fn f_string_state(&self) -> FStringState {
self.f_string_state
}
pub(crate) fn set_f_string_state(&mut self, f_string_state: FStringState) {
self.f_string_state = f_string_state;
}
/// Returns `true` if preview mode is enabled.
pub(crate) const fn is_preview(&self) -> bool {
self.options.preview().is_enabled()
@@ -115,6 +127,18 @@ impl Debug for PyFormatContext<'_> {
}
}
#[derive(Copy, Clone, Debug, Default)]
pub(crate) enum FStringState {
/// The formatter is inside an f-string expression element i.e., between the
/// curly brace in `f"foo {x}"`.
///
/// The containing `FStringContext` is the surrounding f-string context.
InsideExpressionElement(FStringContext),
/// The formatter is outside an f-string.
#[default]
Outside,
}
/// The position of a top-level statement in the module.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Default)]
pub(crate) enum TopLevelStatementPosition {
@@ -332,3 +356,65 @@ where
.set_indent_level(self.saved_level);
}
}
pub(crate) struct WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
buffer: D,
saved_location: FStringState,
}
impl<'a, B, D> WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
pub(crate) fn new(expr_location: FStringState, mut buffer: D) -> Self {
let context = buffer.state_mut().context_mut();
let saved_location = context.f_string_state();
context.set_f_string_state(expr_location);
Self {
buffer,
saved_location,
}
}
}
impl<'a, B, D> Deref for WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
type Target = B;
fn deref(&self) -> &Self::Target {
&self.buffer
}
}
impl<'a, B, D> DerefMut for WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.buffer
}
}
impl<'a, B, D> Drop for WithFStringState<'a, B, D>
where
D: DerefMut<Target = B>,
B: Buffer<Context = PyFormatContext<'a>>,
{
fn drop(&mut self) {
self.buffer
.state_mut()
.context_mut()
.set_f_string_state(self.saved_location);
}
}

View File

@@ -48,6 +48,24 @@ impl NeedsParentheses for ExprFString {
) -> OptionalParentheses {
if self.value.is_implicit_concatenated() {
OptionalParentheses::Multiline
// TODO(dhruvmanila): Ideally what we want here is a new variant which
// is something like:
// - If the expression fits by just adding the parentheses, then add them and
// avoid breaking the f-string expression. So,
// ```
// xxxxxxxxx = (
// f"aaaaaaaaaaaa { xxxxxxx + yyyyyyyy } bbbbbbbbbbbbb"
// )
// ```
// - But, if the expression is too long to fit even with parentheses, then
// don't add the parentheses and instead break the expression at `soft_line_break`.
// ```
// xxxxxxxxx = f"aaaaaaaaaaaa {
// xxxxxxxxx + yyyyyyyyyy
// } bbbbbbbbbbbbb"
// ```
// This isn't decided yet, refer to the relevant discussion:
// https://github.com/astral-sh/ruff/discussions/9785
} else if AnyString::FString(self).is_multiline(context.source()) {
OptionalParentheses::Never
} else {

View File

@@ -248,6 +248,12 @@ pub enum QuoteStyle {
Preserve,
}
impl QuoteStyle {
pub const fn is_preserve(self) -> bool {
matches!(self, QuoteStyle::Preserve)
}
}
impl fmt::Display for QuoteStyle {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
@@ -460,3 +466,12 @@ pub enum PythonVersion {
Py311,
Py312,
}
impl PythonVersion {
/// Return `true` if the current version supports [PEP 701].
///
/// [PEP 701]: https://peps.python.org/pep-0701/
pub fn supports_pep_701(self) -> bool {
self >= Self::Py312
}
}

View File

@@ -2,8 +2,7 @@ use ruff_python_ast::BytesLiteral;
use ruff_text_size::Ranged;
use crate::prelude::*;
use crate::preview::is_hex_codes_in_unicode_sequences_enabled;
use crate::string::{Quoting, StringPart};
use crate::string::{StringNormalizer, StringPart};
#[derive(Default)]
pub struct FormatBytesLiteral;
@@ -12,14 +11,9 @@ impl FormatNodeRule<BytesLiteral> for FormatBytesLiteral {
fn fmt_fields(&self, item: &BytesLiteral, f: &mut PyFormatter) -> FormatResult<()> {
let locator = f.context().locator();
StringPart::from_source(item.range(), &locator)
.normalize(
Quoting::CanChange,
&locator,
f.options().quote_style(),
f.context().docstring(),
is_hex_codes_in_unicode_sequences_enabled(f.context()),
)
StringNormalizer::from_context(f.context())
.with_preferred_quote_style(f.options().quote_style())
.normalize(&StringPart::from_source(item.range(), &locator), &locator)
.fmt(f)
}
}

View File

@@ -1,9 +1,13 @@
use ruff_formatter::write;
use ruff_python_ast::FString;
use ruff_source_file::Locator;
use ruff_text_size::Ranged;
use crate::prelude::*;
use crate::preview::is_hex_codes_in_unicode_sequences_enabled;
use crate::string::{Quoting, StringPart};
use crate::preview::is_f_string_formatting_enabled;
use crate::string::{Quoting, StringNormalizer, StringPart, StringPrefix, StringQuotes};
use super::f_string_element::FormatFStringElement;
/// Formats an f-string which is part of a larger f-string expression.
///
@@ -26,26 +30,126 @@ impl Format<PyFormatContext<'_>> for FormatFString<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
let locator = f.context().locator();
let result = StringPart::from_source(self.value.range(), &locator)
.normalize(
self.quoting,
&locator,
f.options().quote_style(),
f.context().docstring(),
is_hex_codes_in_unicode_sequences_enabled(f.context()),
let string = StringPart::from_source(self.value.range(), &locator);
let normalizer = StringNormalizer::from_context(f.context())
.with_quoting(self.quoting)
.with_preferred_quote_style(f.options().quote_style());
// If f-string formatting is disabled (not in preview), then we will
// fall back to the previous behavior of normalizing the f-string.
if !is_f_string_formatting_enabled(f.context()) {
let result = normalizer.normalize(&string, &locator).fmt(f);
let comments = f.context().comments();
self.value.elements.iter().for_each(|value| {
comments.mark_verbatim_node_comments_formatted(value.into());
// Above method doesn't mark the trailing comments of the f-string elements
// as formatted, so we need to do it manually. For example,
//
// ```python
// f"""foo {
// x:.3f
// # comment
// }"""
// ```
for trailing_comment in comments.trailing(value) {
trailing_comment.mark_formatted();
}
});
return result;
}
let quotes = normalizer.choose_quotes(&string, &locator);
let context = FStringContext::new(
string.prefix(),
quotes,
FStringLayout::from_f_string(self.value, &locator),
);
// Starting prefix and quote
write!(f, [string.prefix(), quotes])?;
f.join()
.entries(
self.value
.elements
.iter()
.map(|element| FormatFStringElement::new(element, context)),
)
.fmt(f);
.finish()?;
// TODO(dhruvmanila): With PEP 701, comments can be inside f-strings.
// This is to mark all of those comments as formatted but we need to
// figure out how to handle them. Note that this needs to be done only
// after the f-string is formatted, so only for all the non-formatted
// comments.
let comments = f.context().comments();
self.value.elements.iter().for_each(|value| {
comments.mark_verbatim_node_comments_formatted(value.into());
});
result
// Ending quote
quotes.fmt(f)
}
}
#[derive(Clone, Copy, Debug)]
pub(crate) struct FStringContext {
prefix: StringPrefix,
quotes: StringQuotes,
layout: FStringLayout,
}
impl FStringContext {
const fn new(prefix: StringPrefix, quotes: StringQuotes, layout: FStringLayout) -> Self {
Self {
prefix,
quotes,
layout,
}
}
pub(crate) const fn quotes(self) -> StringQuotes {
self.quotes
}
pub(crate) const fn prefix(self) -> StringPrefix {
self.prefix
}
pub(crate) const fn layout(self) -> FStringLayout {
self.layout
}
}
#[derive(Copy, Clone, Debug)]
pub(crate) enum FStringLayout {
/// Original f-string is flat.
/// Don't break expressions to keep the string flat.
Flat,
/// Original f-string has multiline expressions in the replacement fields.
/// Allow breaking expressions across multiple lines.
Multiline,
}
impl FStringLayout {
fn from_f_string(f_string: &FString, locator: &Locator) -> Self {
// Heuristic: Allow breaking the f-string expressions across multiple lines
// only if there already is at least one multiline expression. This puts the
// control in the hands of the user to decide if they want to break the
// f-string expressions across multiple lines or not. This is similar to
// how Prettier does it for template literals in JavaScript.
//
// If it's single quoted f-string and it contains a multiline expression, then we
// assume that the target version of Python supports it (3.12+). If there are comments
// used in any of the expression of the f-string, then it's always going to be multiline
// and we assume that the target version of Python supports it (3.12+).
//
// Reference: https://prettier.io/docs/en/next/rationale.html#template-literals
if f_string
.elements
.iter()
.filter_map(|element| element.as_expression())
.any(|expr| memchr::memchr2(b'\n', b'\r', locator.slice(expr).as_bytes()).is_some())
{
Self::Multiline
} else {
Self::Flat
}
}
pub(crate) const fn is_flat(self) -> bool {
matches!(self, Self::Flat)
}
}

View File

@@ -0,0 +1,244 @@
use std::borrow::Cow;
use ruff_formatter::{format_args, write, Buffer, RemoveSoftLinesBuffer};
use ruff_python_ast::{
ConversionFlag, Expr, FStringElement, FStringExpressionElement, FStringLiteralElement,
};
use ruff_text_size::Ranged;
use crate::comments::{dangling_open_parenthesis_comments, trailing_comments};
use crate::context::{FStringState, NodeLevel, WithFStringState, WithNodeLevel};
use crate::prelude::*;
use crate::preview::is_hex_codes_in_unicode_sequences_enabled;
use crate::string::normalize_string;
use crate::verbatim::verbatim_text;
use super::f_string::FStringContext;
/// Formats an f-string element which is either a literal or a formatted expression.
///
/// This delegates the actual formatting to the appropriate formatter.
pub(crate) struct FormatFStringElement<'a> {
element: &'a FStringElement,
context: FStringContext,
}
impl<'a> FormatFStringElement<'a> {
pub(crate) fn new(element: &'a FStringElement, context: FStringContext) -> Self {
Self { element, context }
}
}
impl Format<PyFormatContext<'_>> for FormatFStringElement<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
match self.element {
FStringElement::Literal(string_literal) => {
FormatFStringLiteralElement::new(string_literal, self.context).fmt(f)
}
FStringElement::Expression(expression) => {
FormatFStringExpressionElement::new(expression, self.context).fmt(f)
}
}
}
}
/// Formats an f-string literal element.
pub(crate) struct FormatFStringLiteralElement<'a> {
element: &'a FStringLiteralElement,
context: FStringContext,
}
impl<'a> FormatFStringLiteralElement<'a> {
pub(crate) fn new(element: &'a FStringLiteralElement, context: FStringContext) -> Self {
Self { element, context }
}
}
impl Format<PyFormatContext<'_>> for FormatFStringLiteralElement<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
let literal_content = f.context().locator().slice(self.element.range());
let normalized = normalize_string(
literal_content,
self.context.quotes(),
self.context.prefix(),
is_hex_codes_in_unicode_sequences_enabled(f.context()),
);
match &normalized {
Cow::Borrowed(_) => source_text_slice(self.element.range()).fmt(f),
Cow::Owned(normalized) => text(normalized).fmt(f),
}
}
}
/// Formats an f-string expression element.
pub(crate) struct FormatFStringExpressionElement<'a> {
element: &'a FStringExpressionElement,
context: FStringContext,
}
impl<'a> FormatFStringExpressionElement<'a> {
pub(crate) fn new(element: &'a FStringExpressionElement, context: FStringContext) -> Self {
Self { element, context }
}
}
impl Format<PyFormatContext<'_>> for FormatFStringExpressionElement<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
let FStringExpressionElement {
expression,
debug_text,
conversion,
format_spec,
..
} = self.element;
if let Some(debug_text) = debug_text {
token("{").fmt(f)?;
let comments = f.context().comments();
// If the element has a debug text, preserve the same formatting as
// in the source code (`verbatim`). This requires us to mark all of
// the surrounding comments as formatted.
comments.mark_verbatim_node_comments_formatted(self.element.into());
// Above method doesn't mark the leading and trailing comments of the element.
// There can't be any leading comments for an expression element, but there
// can be trailing comments. For example,
//
// ```python
// f"""foo {
// x:.3f
// # trailing comment
// }"""
// ```
for trailing_comment in comments.trailing(self.element) {
trailing_comment.mark_formatted();
}
write!(
f,
[
text(&debug_text.leading),
verbatim_text(&**expression),
text(&debug_text.trailing),
]
)?;
// Even if debug text is present, any whitespace between the
// conversion flag and the format spec doesn't need to be preserved.
match conversion {
ConversionFlag::Str => text("!s").fmt(f)?,
ConversionFlag::Ascii => text("!a").fmt(f)?,
ConversionFlag::Repr => text("!r").fmt(f)?,
ConversionFlag::None => (),
}
if let Some(format_spec) = format_spec.as_deref() {
write!(f, [token(":"), verbatim_text(format_spec)])?;
}
token("}").fmt(f)
} else {
let comments = f.context().comments().clone();
let dangling_item_comments = comments.dangling(self.element);
let item = format_with(|f| {
let bracket_spacing = match expression.as_ref() {
// If an expression starts with a `{`, we need to add a space before the
// curly brace to avoid turning it into a literal curly with `{{`.
//
// For example,
// ```python
// f"{ {'x': 1, 'y': 2} }"
// # ^ ^
// ```
//
// We need to preserve the space highlighted by `^`. The whitespace
// before the closing curly brace is not strictly necessary, but it's
// added to maintain consistency.
Expr::Dict(_) | Expr::DictComp(_) | Expr::Set(_) | Expr::SetComp(_) => {
Some(format_with(|f| {
if self.context.layout().is_flat() {
space().fmt(f)
} else {
soft_line_break_or_space().fmt(f)
}
}))
}
_ => None,
};
// Update the context to be inside the f-string expression element.
let f = &mut WithFStringState::new(
FStringState::InsideExpressionElement(self.context),
f,
);
write!(f, [bracket_spacing, expression.format()])?;
// Conversion comes first, then the format spec.
match conversion {
ConversionFlag::Str => text("!s").fmt(f)?,
ConversionFlag::Ascii => text("!a").fmt(f)?,
ConversionFlag::Repr => text("!r").fmt(f)?,
ConversionFlag::None => (),
}
if let Some(format_spec) = format_spec.as_deref() {
token(":").fmt(f)?;
f.join()
.entries(
format_spec
.elements
.iter()
.map(|element| FormatFStringElement::new(element, self.context)),
)
.finish()?;
// These trailing comments can only occur if the format specifier is
// present. For example,
//
// ```python
// f"{
// x:.3f
// # comment
// }"
// ```
//
// Any other trailing comments are attached to the expression itself.
trailing_comments(comments.trailing(self.element)).fmt(f)?;
}
bracket_spacing.fmt(f)
});
let open_parenthesis_comments = if dangling_item_comments.is_empty() {
None
} else {
Some(dangling_open_parenthesis_comments(dangling_item_comments))
};
token("{").fmt(f)?;
{
let mut f = WithNodeLevel::new(NodeLevel::ParenthesizedExpression, f);
if self.context.layout().is_flat() {
let mut buffer = RemoveSoftLinesBuffer::new(&mut *f);
write!(buffer, [open_parenthesis_comments, item])?;
} else {
group(&format_args![
open_parenthesis_comments,
soft_block_indent(&item)
])
.fmt(&mut f)?;
}
}
token("}").fmt(f)
}
}
}

View File

@@ -7,6 +7,7 @@ pub(crate) mod decorator;
pub(crate) mod elif_else_clause;
pub(crate) mod except_handler_except_handler;
pub(crate) mod f_string;
pub(crate) mod f_string_element;
pub(crate) mod f_string_part;
pub(crate) mod identifier;
pub(crate) mod keyword;

View File

@@ -2,8 +2,7 @@ use ruff_python_ast::StringLiteral;
use ruff_text_size::Ranged;
use crate::prelude::*;
use crate::preview::is_hex_codes_in_unicode_sequences_enabled;
use crate::string::{docstring, Quoting, StringPart};
use crate::string::{docstring, Quoting, StringNormalizer, StringPart};
use crate::QuoteStyle;
pub(crate) struct FormatStringLiteral<'a> {
@@ -50,20 +49,22 @@ impl Format<PyFormatContext<'_>> for FormatStringLiteral<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
let locator = f.context().locator();
let quote_style = if self.layout.is_docstring() {
// Per PEP 8 and PEP 257, always prefer double quotes for docstrings
let quote_style = f.options().quote_style();
let quote_style = if self.layout.is_docstring() && !quote_style.is_preserve() {
// Per PEP 8 and PEP 257, always prefer double quotes for docstrings,
// except when using quote-style=preserve
QuoteStyle::Double
} else {
f.options().quote_style()
quote_style
};
let normalized = StringPart::from_source(self.value.range(), &locator).normalize(
self.layout.quoting(),
&locator,
quote_style,
f.context().docstring(),
is_hex_codes_in_unicode_sequences_enabled(f.context()),
);
let normalized = StringNormalizer::from_context(f.context())
.with_quoting(self.layout.quoting())
.with_preferred_quote_style(quote_style)
.normalize(
&StringPart::from_source(self.value.range(), &locator),
&locator,
);
if self.layout.is_docstring() {
docstring::format(&normalized, f)

View File

@@ -81,3 +81,8 @@ pub(crate) const fn is_multiline_string_handling_enabled(context: &PyFormatConte
pub(crate) const fn is_format_module_docstring_enabled(context: &PyFormatContext) -> bool {
context.is_preview()
}
/// Returns `true` if the [`f-string formatting`](https://github.com/astral-sh/ruff/issues/7594) preview style is enabled.
pub(crate) fn is_f_string_formatting_enabled(context: &PyFormatContext) -> bool {
context.is_preview()
}

View File

@@ -214,9 +214,9 @@ impl<'ast> PreorderVisitor<'ast> for FindEnclosingNode<'_, 'ast> {
// Don't pick potential docstrings as the closest enclosing node because `suite.rs` than fails to identify them as
// docstrings and docstring formatting won't kick in.
// Format the enclosing node instead and slice the formatted docstring from the result.
let is_maybe_docstring = node
.as_stmt_expr()
.is_some_and(|stmt| DocstringStmt::is_docstring_statement(stmt));
let is_maybe_docstring = node.as_stmt_expr().is_some_and(|stmt| {
DocstringStmt::is_docstring_statement(stmt, self.context.options().source_type())
});
if is_maybe_docstring {
return TraversalSignal::Skip;

View File

@@ -103,7 +103,9 @@ impl FormatRule<Suite, PyFormatContext<'_>> for FormatSuite {
}
SuiteKind::Function => {
if let Some(docstring) = DocstringStmt::try_from_statement(first, self.kind) {
if let Some(docstring) =
DocstringStmt::try_from_statement(first, self.kind, source_type)
{
SuiteChildStatement::Docstring(docstring)
} else {
SuiteChildStatement::Other(first)
@@ -111,7 +113,9 @@ impl FormatRule<Suite, PyFormatContext<'_>> for FormatSuite {
}
SuiteKind::Class => {
if let Some(docstring) = DocstringStmt::try_from_statement(first, self.kind) {
if let Some(docstring) =
DocstringStmt::try_from_statement(first, self.kind, source_type)
{
if !comments.has_leading(first)
&& lines_before(first.start(), source) > 1
&& !source_type.is_stub()
@@ -143,7 +147,9 @@ impl FormatRule<Suite, PyFormatContext<'_>> for FormatSuite {
}
SuiteKind::TopLevel => {
if is_format_module_docstring_enabled(f.context()) {
if let Some(docstring) = DocstringStmt::try_from_statement(first, self.kind) {
if let Some(docstring) =
DocstringStmt::try_from_statement(first, self.kind, source_type)
{
SuiteChildStatement::Docstring(docstring)
} else {
SuiteChildStatement::Other(first)
@@ -184,7 +190,8 @@ impl FormatRule<Suite, PyFormatContext<'_>> for FormatSuite {
true
} else if is_module_docstring_newlines_enabled(f.context())
&& self.kind == SuiteKind::TopLevel
&& DocstringStmt::try_from_statement(first.statement(), self.kind).is_some()
&& DocstringStmt::try_from_statement(first.statement(), self.kind, source_type)
.is_some()
{
// Only in preview mode, insert a newline after a module level docstring, but treat
// it as a docstring otherwise. See: https://github.com/psf/black/pull/3932.
@@ -734,7 +741,16 @@ pub(crate) struct DocstringStmt<'a> {
impl<'a> DocstringStmt<'a> {
/// Checks if the statement is a simple string that can be formatted as a docstring
fn try_from_statement(stmt: &'a Stmt, suite_kind: SuiteKind) -> Option<DocstringStmt<'a>> {
fn try_from_statement(
stmt: &'a Stmt,
suite_kind: SuiteKind,
source_type: PySourceType,
) -> Option<DocstringStmt<'a>> {
// Notebooks don't have a concept of modules, therefore, don't recognise the first string as the module docstring.
if source_type.is_ipynb() && suite_kind == SuiteKind::TopLevel {
return None;
}
let Stmt::Expr(ast::StmtExpr { value, .. }) = stmt else {
return None;
};
@@ -752,7 +768,11 @@ impl<'a> DocstringStmt<'a> {
}
}
pub(crate) fn is_docstring_statement(stmt: &StmtExpr) -> bool {
pub(crate) fn is_docstring_statement(stmt: &StmtExpr, source_type: PySourceType) -> bool {
if source_type.is_ipynb() {
return false;
}
if let Expr::StringLiteral(ast::ExprStringLiteral { value, .. }) = stmt.value.as_ref() {
!value.is_implicit_concatenated()
} else {

View File

@@ -0,0 +1,212 @@
use std::iter::FusedIterator;
use memchr::memchr2;
use ruff_python_ast::{
self as ast, AnyNodeRef, Expr, ExprBytesLiteral, ExprFString, ExprStringLiteral, ExpressionRef,
StringLiteral,
};
use ruff_source_file::Locator;
use ruff_text_size::{Ranged, TextLen, TextRange};
use crate::expression::expr_f_string::f_string_quoting;
use crate::other::f_string::FormatFString;
use crate::other::string_literal::{FormatStringLiteral, StringLiteralKind};
use crate::prelude::*;
use crate::string::{Quoting, StringPrefix, StringQuotes};
/// Represents any kind of string expression. This could be either a string,
/// bytes or f-string.
#[derive(Copy, Clone, Debug)]
pub(crate) enum AnyString<'a> {
String(&'a ExprStringLiteral),
Bytes(&'a ExprBytesLiteral),
FString(&'a ExprFString),
}
impl<'a> AnyString<'a> {
/// Creates a new [`AnyString`] from the given [`Expr`].
///
/// Returns `None` if the expression is not either a string, bytes or f-string.
pub(crate) fn from_expression(expression: &'a Expr) -> Option<AnyString<'a>> {
match expression {
Expr::StringLiteral(string) => Some(AnyString::String(string)),
Expr::BytesLiteral(bytes) => Some(AnyString::Bytes(bytes)),
Expr::FString(fstring) => Some(AnyString::FString(fstring)),
_ => None,
}
}
/// Returns `true` if the string is implicitly concatenated.
pub(crate) fn is_implicit_concatenated(self) -> bool {
match self {
Self::String(ExprStringLiteral { value, .. }) => value.is_implicit_concatenated(),
Self::Bytes(ExprBytesLiteral { value, .. }) => value.is_implicit_concatenated(),
Self::FString(ExprFString { value, .. }) => value.is_implicit_concatenated(),
}
}
/// Returns the quoting to be used for this string.
pub(super) fn quoting(self, locator: &Locator<'_>) -> Quoting {
match self {
Self::String(_) | Self::Bytes(_) => Quoting::CanChange,
Self::FString(f_string) => f_string_quoting(f_string, locator),
}
}
/// Returns a vector of all the [`AnyStringPart`] of this string.
pub(super) fn parts(self, quoting: Quoting) -> AnyStringPartsIter<'a> {
match self {
Self::String(ExprStringLiteral { value, .. }) => {
AnyStringPartsIter::String(value.iter())
}
Self::Bytes(ExprBytesLiteral { value, .. }) => AnyStringPartsIter::Bytes(value.iter()),
Self::FString(ExprFString { value, .. }) => {
AnyStringPartsIter::FString(value.iter(), quoting)
}
}
}
pub(crate) fn is_multiline(self, source: &str) -> bool {
match self {
AnyString::String(_) | AnyString::Bytes(_) => {
let contents = &source[self.range()];
let prefix = StringPrefix::parse(contents);
let quotes = StringQuotes::parse(
&contents[TextRange::new(prefix.text_len(), contents.text_len())],
);
quotes.is_some_and(StringQuotes::is_triple)
&& memchr2(b'\n', b'\r', contents.as_bytes()).is_some()
}
AnyString::FString(fstring) => {
memchr2(b'\n', b'\r', source[fstring.range].as_bytes()).is_some()
}
}
}
}
impl Ranged for AnyString<'_> {
fn range(&self) -> TextRange {
match self {
Self::String(expr) => expr.range(),
Self::Bytes(expr) => expr.range(),
Self::FString(expr) => expr.range(),
}
}
}
impl<'a> From<&AnyString<'a>> for AnyNodeRef<'a> {
fn from(value: &AnyString<'a>) -> Self {
match value {
AnyString::String(expr) => AnyNodeRef::ExprStringLiteral(expr),
AnyString::Bytes(expr) => AnyNodeRef::ExprBytesLiteral(expr),
AnyString::FString(expr) => AnyNodeRef::ExprFString(expr),
}
}
}
impl<'a> From<AnyString<'a>> for AnyNodeRef<'a> {
fn from(value: AnyString<'a>) -> Self {
AnyNodeRef::from(&value)
}
}
impl<'a> From<&AnyString<'a>> for ExpressionRef<'a> {
fn from(value: &AnyString<'a>) -> Self {
match value {
AnyString::String(expr) => ExpressionRef::StringLiteral(expr),
AnyString::Bytes(expr) => ExpressionRef::BytesLiteral(expr),
AnyString::FString(expr) => ExpressionRef::FString(expr),
}
}
}
pub(super) enum AnyStringPartsIter<'a> {
String(std::slice::Iter<'a, StringLiteral>),
Bytes(std::slice::Iter<'a, ast::BytesLiteral>),
FString(std::slice::Iter<'a, ast::FStringPart>, Quoting),
}
impl<'a> Iterator for AnyStringPartsIter<'a> {
type Item = AnyStringPart<'a>;
fn next(&mut self) -> Option<Self::Item> {
let part = match self {
Self::String(inner) => {
let part = inner.next()?;
AnyStringPart::String {
part,
layout: StringLiteralKind::String,
}
}
Self::Bytes(inner) => AnyStringPart::Bytes(inner.next()?),
Self::FString(inner, quoting) => {
let part = inner.next()?;
match part {
ast::FStringPart::Literal(string_literal) => AnyStringPart::String {
part: string_literal,
layout: StringLiteralKind::InImplicitlyConcatenatedFString(*quoting),
},
ast::FStringPart::FString(f_string) => AnyStringPart::FString {
part: f_string,
quoting: *quoting,
},
}
}
};
Some(part)
}
}
impl FusedIterator for AnyStringPartsIter<'_> {}
/// Represents any kind of string which is part of an implicitly concatenated
/// string. This could be either a string, bytes or f-string.
///
/// This is constructed from the [`AnyString::parts`] method on [`AnyString`].
#[derive(Clone, Debug)]
pub(super) enum AnyStringPart<'a> {
String {
part: &'a ast::StringLiteral,
layout: StringLiteralKind,
},
Bytes(&'a ast::BytesLiteral),
FString {
part: &'a ast::FString,
quoting: Quoting,
},
}
impl<'a> From<&AnyStringPart<'a>> for AnyNodeRef<'a> {
fn from(value: &AnyStringPart<'a>) -> Self {
match value {
AnyStringPart::String { part, .. } => AnyNodeRef::StringLiteral(part),
AnyStringPart::Bytes(part) => AnyNodeRef::BytesLiteral(part),
AnyStringPart::FString { part, .. } => AnyNodeRef::FString(part),
}
}
}
impl Ranged for AnyStringPart<'_> {
fn range(&self) -> TextRange {
match self {
Self::String { part, .. } => part.range(),
Self::Bytes(part) => part.range(),
Self::FString { part, .. } => part.range(),
}
}
}
impl Format<PyFormatContext<'_>> for AnyStringPart<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
match self {
AnyStringPart::String { part, layout } => {
FormatStringLiteral::new(part, *layout).fmt(f)
}
AnyStringPart::Bytes(bytes_literal) => bytes_literal.format().fmt(f),
AnyStringPart::FString { part, quoting } => FormatFString::new(part, *quoting).fmt(f),
}
}
}

View File

@@ -2,11 +2,13 @@
// "reStructuredText."
#![allow(clippy::doc_markdown)]
use std::cmp::Ordering;
use std::{borrow::Cow, collections::VecDeque};
use itertools::Itertools;
use ruff_formatter::printer::SourceMapGeneration;
use ruff_python_parser::ParseError;
use {once_cell::sync::Lazy, regex::Regex};
use {
ruff_formatter::{write, FormatOptions, IndentStyle, LineWidth, Printed},
@@ -80,9 +82,7 @@ use super::{NormalizedString, QuoteChar};
/// ```
///
/// Tabs are counted by padding them to the next multiple of 8 according to
/// [`str.expandtabs`](https://docs.python.org/3/library/stdtypes.html#str.expandtabs). When
/// we see indentation that contains a tab or any other none ascii-space whitespace we rewrite the
/// string.
/// [`str.expandtabs`](https://docs.python.org/3/library/stdtypes.html#str.expandtabs).
///
/// Additionally, if any line in the docstring has less indentation than the docstring
/// (effectively a negative indentation wrt. to the current level), we pad all lines to the
@@ -104,8 +104,12 @@ use super::{NormalizedString, QuoteChar};
/// line c
/// """
/// ```
/// The indentation is rewritten to all-spaces when using [`IndentStyle::Space`].
/// The formatter preserves tab-indentations when using [`IndentStyle::Tab`], but doesn't convert
/// `indent-width * spaces` to tabs because doing so could break ASCII art and other docstrings
/// that use spaces for alignment.
pub(crate) fn format(normalized: &NormalizedString, f: &mut PyFormatter) -> FormatResult<()> {
let docstring = &normalized.text;
let docstring = &normalized.text();
// Black doesn't change the indentation of docstrings that contain an escaped newline
if contains_unescaped_newline(docstring) {
@@ -121,7 +125,7 @@ pub(crate) fn format(normalized: &NormalizedString, f: &mut PyFormatter) -> Form
let mut lines = docstring.split('\n').peekable();
// Start the string
write!(f, [normalized.prefix, normalized.quotes])?;
write!(f, [normalized.prefix(), normalized.quotes()])?;
// We track where in the source docstring we are (in source code byte offsets)
let mut offset = normalized.start();
@@ -137,7 +141,7 @@ pub(crate) fn format(normalized: &NormalizedString, f: &mut PyFormatter) -> Form
// Edge case: The first line is `""" "content`, so we need to insert chaperone space that keep
// inner quotes and closing quotes from getting to close to avoid `""""content`
if trim_both.starts_with(normalized.quotes.quote_char.as_char()) {
if trim_both.starts_with(normalized.quotes().quote_char.as_char()) {
space().fmt(f)?;
}
@@ -164,7 +168,7 @@ pub(crate) fn format(normalized: &NormalizedString, f: &mut PyFormatter) -> Form
{
space().fmt(f)?;
}
normalized.quotes.fmt(f)?;
normalized.quotes().fmt(f)?;
return Ok(());
}
@@ -176,21 +180,21 @@ pub(crate) fn format(normalized: &NormalizedString, f: &mut PyFormatter) -> Form
// align it with the docstring statement. Conversely, if all lines are over-indented, we strip
// the extra indentation. We call this stripped indentation since it's relative to the block
// indent printer-made indentation.
let stripped_indentation_length = lines
let stripped_indentation = lines
.clone()
// We don't want to count whitespace-only lines as miss-indented
.filter(|line| !line.trim().is_empty())
.map(indentation_length)
.min()
.map(Indentation::from_str)
.min_by_key(|indentation| indentation.width())
.unwrap_or_default();
DocstringLinePrinter {
f,
action_queue: VecDeque::new(),
offset,
stripped_indentation_length,
stripped_indentation,
already_normalized,
quote_char: normalized.quotes.quote_char,
quote_char: normalized.quotes().quote_char,
code_example: CodeExample::default(),
}
.add_iter(lines)?;
@@ -203,7 +207,7 @@ pub(crate) fn format(normalized: &NormalizedString, f: &mut PyFormatter) -> Form
space().fmt(f)?;
}
write!(f, [normalized.quotes])
write!(f, [normalized.quotes()])
}
fn contains_unescaped_newline(haystack: &str) -> bool {
@@ -240,9 +244,9 @@ struct DocstringLinePrinter<'ast, 'buf, 'fmt, 'src> {
/// printed.
offset: TextSize,
/// Indentation alignment (in columns) based on the least indented line in the
/// Indentation alignment based on the least indented line in the
/// docstring.
stripped_indentation_length: usize,
stripped_indentation: Indentation,
/// Whether the docstring is overall already considered normalized. When it
/// is, the formatter can take a fast path.
@@ -345,7 +349,7 @@ impl<'ast, 'buf, 'fmt, 'src> DocstringLinePrinter<'ast, 'buf, 'fmt, 'src> {
};
// This looks suspicious, but it's consistent with the whitespace
// normalization that will occur anyway.
let indent = " ".repeat(min_indent);
let indent = " ".repeat(min_indent.width());
for docline in formatted_lines {
self.print_one(
&docline.map(|line| std::format!("{indent}{line}")),
@@ -355,7 +359,7 @@ impl<'ast, 'buf, 'fmt, 'src> DocstringLinePrinter<'ast, 'buf, 'fmt, 'src> {
CodeExampleKind::Markdown(fenced) => {
// This looks suspicious, but it's consistent with the whitespace
// normalization that will occur anyway.
let indent = " ".repeat(fenced.opening_fence_indent);
let indent = " ".repeat(fenced.opening_fence_indent.width());
for docline in formatted_lines {
self.print_one(
&docline.map(|line| std::format!("{indent}{line}")),
@@ -387,12 +391,58 @@ impl<'ast, 'buf, 'fmt, 'src> DocstringLinePrinter<'ast, 'buf, 'fmt, 'src> {
};
}
let tab_or_non_ascii_space = trim_end
.chars()
.take_while(|c| c.is_whitespace())
.any(|c| c != ' ');
let indent_offset = match self.f.options().indent_style() {
// Normalize all indent to spaces.
IndentStyle::Space => {
let tab_or_non_ascii_space = trim_end
.chars()
.take_while(|c| c.is_whitespace())
.any(|c| c != ' ');
if tab_or_non_ascii_space {
if tab_or_non_ascii_space {
None
} else {
// It's guaranteed that the `indent` is all spaces because `tab_or_non_ascii_space` is
// `false` (indent contains neither tabs nor non-space whitespace).
let stripped_indentation_len = self.stripped_indentation.text_len();
// Take the string with the trailing whitespace removed, then also
// skip the leading whitespace.
Some(stripped_indentation_len)
}
}
IndentStyle::Tab => {
let line_indent = Indentation::from_str(trim_end);
let non_ascii_whitespace = trim_end
.chars()
.take_while(|c| c.is_whitespace())
.any(|c| !matches!(c, ' ' | '\t'));
let trimmed = line_indent.trim_start(self.stripped_indentation);
// Preserve tabs that are used for indentation, but only if the indent isn't
// * a mix of tabs and spaces
// * the `stripped_indentation` is a prefix of the line's indent
// * the trimmed indent isn't spaces followed by tabs because that would result in a
// mixed tab, spaces, tab indentation, resulting in instabilities.
let preserve_indent = !non_ascii_whitespace
&& trimmed.is_some_and(|trimmed| !trimmed.is_spaces_tabs());
preserve_indent.then_some(self.stripped_indentation.text_len())
}
};
if let Some(indent_offset) = indent_offset {
// Take the string with the trailing whitespace removed, then also
// skip the leading whitespace.
if self.already_normalized {
let trimmed_line_range =
TextRange::at(line.offset, trim_end.text_len()).add_start(indent_offset);
source_text_slice(trimmed_line_range).fmt(self.f)?;
} else {
text(&trim_end[indent_offset.to_usize()..]).fmt(self.f)?;
}
} else {
// We strip the indentation that is shared with the docstring
// statement, unless a line was indented less than the docstring
// statement, in which case we strip only this much indentation to
@@ -400,24 +450,11 @@ impl<'ast, 'buf, 'fmt, 'src> DocstringLinePrinter<'ast, 'buf, 'fmt, 'src> {
// overindented, in which case we strip the additional whitespace
// (see example in [`format_docstring`] doc comment). We then
// prepend the in-docstring indentation to the string.
let indent_len = indentation_length(trim_end) - self.stripped_indentation_length;
let indent_len =
Indentation::from_str(trim_end).width() - self.stripped_indentation.width();
let in_docstring_indent = " ".repeat(indent_len) + trim_end.trim_start();
text(&in_docstring_indent).fmt(self.f)?;
} else {
// It's guaranteed that the `indent` is all spaces because `tab_or_non_ascii_space` is
// `false` (indent contains neither tabs nor non-space whitespace).
// Take the string with the trailing whitespace removed, then also
// skip the leading whitespace.
let trimmed_line_range = TextRange::at(line.offset, trim_end.text_len())
.add_start(TextSize::try_from(self.stripped_indentation_length).unwrap());
if self.already_normalized {
source_text_slice(trimmed_line_range).fmt(self.f)?;
} else {
// All indents are ascii spaces, so the slicing is correct.
text(&trim_end[self.stripped_indentation_length..]).fmt(self.f)?;
}
}
};
// We handled the case that the closing quotes are on their own line
// above (the last line is empty except for whitespace). If they are on
@@ -898,8 +935,7 @@ struct CodeExampleRst<'src> {
/// The lines that have been seen so far that make up the block.
lines: Vec<CodeExampleLine<'src>>,
/// The indent of the line "opening" this block measured via
/// `indentation_length` (in columns).
/// The indent of the line "opening" this block in columns.
///
/// It can either be the indent of a line ending with `::` (for a literal
/// block) or the indent of a line starting with `.. ` (a directive).
@@ -907,9 +943,9 @@ struct CodeExampleRst<'src> {
/// The content body of a block needs to be indented more than the line
/// opening the block, so we use this indentation to look for indentation
/// that is "more than" it.
opening_indent: usize,
opening_indent: Indentation,
/// The minimum indent of the block measured via `indentation_length`.
/// The minimum indent of the block in columns.
///
/// This is `None` until the first such line is seen. If no such line is
/// found, then we consider it an invalid block and bail out of trying to
@@ -926,7 +962,7 @@ struct CodeExampleRst<'src> {
/// When the code snippet has been extracted, it is re-built before being
/// reformatted. The minimum indent is stripped from each line when it is
/// re-built.
min_indent: Option<usize>,
min_indent: Option<Indentation>,
/// Whether this is a directive block or not. When not a directive, this is
/// a literal block. The main difference between them is that they start
@@ -975,7 +1011,7 @@ impl<'src> CodeExampleRst<'src> {
}
Some(CodeExampleRst {
lines: vec![],
opening_indent: indentation_length(opening_indent),
opening_indent: Indentation::from_str(opening_indent),
min_indent: None,
is_directive: false,
})
@@ -1013,7 +1049,7 @@ impl<'src> CodeExampleRst<'src> {
}
Some(CodeExampleRst {
lines: vec![],
opening_indent: indentation_length(original.line),
opening_indent: Indentation::from_str(original.line),
min_indent: None,
is_directive: true,
})
@@ -1033,7 +1069,7 @@ impl<'src> CodeExampleRst<'src> {
line.code = if line.original.line.trim().is_empty() {
""
} else {
indentation_trim(min_indent, line.original.line)
min_indent.trim_start_str(line.original.line)
};
}
&self.lines
@@ -1070,7 +1106,9 @@ impl<'src> CodeExampleRst<'src> {
// an empty line followed by an unindented non-empty line.
if let Some(next) = original.next {
let (next_indent, next_rest) = indent_with_suffix(next);
if !next_rest.is_empty() && indentation_length(next_indent) <= self.opening_indent {
if !next_rest.is_empty()
&& Indentation::from_str(next_indent) <= self.opening_indent
{
self.push_format_action(queue);
return None;
}
@@ -1082,7 +1120,7 @@ impl<'src> CodeExampleRst<'src> {
queue.push_back(CodeExampleAddAction::Kept);
return Some(self);
}
let indent_len = indentation_length(indent);
let indent_len = Indentation::from_str(indent);
if indent_len <= self.opening_indent {
// If we find an unindented non-empty line at the same (or less)
// indentation of the opening line at this point, then we know it
@@ -1144,7 +1182,7 @@ impl<'src> CodeExampleRst<'src> {
queue.push_back(CodeExampleAddAction::Print { original });
return Some(self);
}
let min_indent = indentation_length(indent);
let min_indent = Indentation::from_str(indent);
// At this point, we found a non-empty line. The only thing we require
// is that its indentation is strictly greater than the indentation of
// the line containing the `::`. Otherwise, we treat this as an invalid
@@ -1218,12 +1256,11 @@ struct CodeExampleMarkdown<'src> {
/// The lines that have been seen so far that make up the block.
lines: Vec<CodeExampleLine<'src>>,
/// The indent of the line "opening" fence of this block measured via
/// `indentation_length` (in columns).
/// The indent of the line "opening" fence of this block in columns.
///
/// This indentation is trimmed from the indentation of every line in the
/// body of the code block,
opening_fence_indent: usize,
opening_fence_indent: Indentation,
/// The kind of fence, backticks or tildes, used for this block. We need to
/// keep track of which kind was used to open the block in order to look
@@ -1292,7 +1329,7 @@ impl<'src> CodeExampleMarkdown<'src> {
};
Some(CodeExampleMarkdown {
lines: vec![],
opening_fence_indent: indentation_length(opening_fence_indent),
opening_fence_indent: Indentation::from_str(opening_fence_indent),
fence_kind,
fence_len,
})
@@ -1325,7 +1362,7 @@ impl<'src> CodeExampleMarkdown<'src> {
// its indent normalized. And, at the time of writing, a subsequent
// formatting run undoes this indentation, thus violating idempotency.
if !original.line.trim_whitespace().is_empty()
&& indentation_length(original.line) < self.opening_fence_indent
&& Indentation::from_str(original.line) < self.opening_fence_indent
{
queue.push_back(self.into_reset_action());
queue.push_back(CodeExampleAddAction::Print { original });
@@ -1371,7 +1408,7 @@ impl<'src> CodeExampleMarkdown<'src> {
// Unlike reStructuredText blocks, for Markdown fenced code blocks, the
// indentation that we want to strip from each line is known when the
// block is opened. So we can strip it as we collect lines.
let code = indentation_trim(self.opening_fence_indent, original.line);
let code = self.opening_fence_indent.trim_start_str(original.line);
self.lines.push(CodeExampleLine { original, code });
}
@@ -1486,7 +1523,6 @@ enum CodeExampleAddAction<'src> {
/// results in that code example becoming invalid. In this case,
/// we don't want to treat it as a code example, but instead write
/// back the lines to the docstring unchanged.
#[allow(dead_code)] // FIXME: remove when reStructuredText support is added
Reset {
/// The lines of code that we collected but should be printed back to
/// the docstring as-is and not formatted.
@@ -1533,57 +1569,245 @@ fn docstring_format_source(
/// that avoids `content""""` and `content\"""`. This does only applies to un-escaped backslashes,
/// so `content\\ """` doesn't need a space while `content\\\ """` does.
fn needs_chaperone_space(normalized: &NormalizedString, trim_end: &str) -> bool {
trim_end.ends_with(normalized.quotes.quote_char.as_char())
trim_end.ends_with(normalized.quotes().quote_char.as_char())
|| trim_end.chars().rev().take_while(|c| *c == '\\').count() % 2 == 1
}
/// Returns the indentation's visual width in columns/spaces.
///
/// For docstring indentation, black counts spaces as 1 and tabs by increasing the indentation up
/// to the next multiple of 8. This is effectively a port of
/// [`str.expandtabs`](https://docs.python.org/3/library/stdtypes.html#str.expandtabs),
/// which black [calls with the default tab width of 8](https://github.com/psf/black/blob/c36e468794f9256d5e922c399240d49782ba04f1/src/black/strings.py#L61).
fn indentation_length(line: &str) -> usize {
let mut indentation = 0usize;
for char in line.chars() {
if char == '\t' {
// Pad to the next multiple of tab_width
indentation += 8 - (indentation.rem_euclid(8));
} else if char.is_whitespace() {
indentation += char.len_utf8();
} else {
break;
}
}
indentation
#[derive(Copy, Clone, Debug)]
enum Indentation {
/// Space only indentation or an empty indentation.
///
/// The value is the number of spaces.
Spaces(usize),
/// Tabs only indentation.
Tabs(usize),
/// Indentation that uses tabs followed by spaces.
/// Also known as smart tabs where tabs are used for indents, and spaces for alignment.
TabSpaces { tabs: usize, spaces: usize },
/// Indentation that uses spaces followed by tabs.
SpacesTabs { spaces: usize, tabs: usize },
/// Mixed indentation of tabs and spaces.
Mixed {
/// The visual width of the indentation in columns.
width: usize,
/// The length of the indentation in bytes
len: TextSize,
},
}
/// Trims at most `indent_len` indentation from the beginning of `line`.
///
/// This treats indentation in precisely the same way as `indentation_length`.
/// As such, it is expected that `indent_len` is computed from
/// `indentation_length`. This is useful when one needs to trim some minimum
/// level of indentation from a code snippet collected from a docstring before
/// attempting to reformat it.
fn indentation_trim(indent_len: usize, line: &str) -> &str {
let mut seen_indent_len = 0;
let mut trimmed = line;
for char in line.chars() {
if seen_indent_len >= indent_len {
return trimmed;
impl Indentation {
const TAB_INDENT_WIDTH: usize = 8;
fn from_str(s: &str) -> Self {
let mut iter = s.chars().peekable();
let spaces = iter.peeking_take_while(|c| *c == ' ').count();
let tabs = iter.peeking_take_while(|c| *c == '\t').count();
if tabs == 0 {
// No indent, or spaces only indent
return Self::Spaces(spaces);
}
if char == '\t' {
// Pad to the next multiple of tab_width
seen_indent_len += 8 - (seen_indent_len.rem_euclid(8));
trimmed = &trimmed[1..];
} else if char.is_whitespace() {
seen_indent_len += char.len_utf8();
trimmed = &trimmed[char.len_utf8()..];
} else {
break;
let align_spaces = iter.peeking_take_while(|c| *c == ' ').count();
if spaces == 0 {
if align_spaces == 0 {
return Self::Tabs(tabs);
}
// At this point it's either a smart tab (tabs followed by spaces) or a wild mix of tabs and spaces.
if iter.peek().copied() != Some('\t') {
return Self::TabSpaces {
tabs,
spaces: align_spaces,
};
}
} else if align_spaces == 0 {
return Self::SpacesTabs { spaces, tabs };
}
// Sequence of spaces.. tabs, spaces, tabs...
let mut width = spaces + tabs * Self::TAB_INDENT_WIDTH + align_spaces;
// SAFETY: Safe because Ruff doesn't support files larger than 4GB.
let mut len = TextSize::try_from(spaces + tabs + align_spaces).unwrap();
for char in iter {
if char == '\t' {
// Pad to the next multiple of tab_width
width += Self::TAB_INDENT_WIDTH - (width.rem_euclid(Self::TAB_INDENT_WIDTH));
len += '\t'.text_len();
} else if char.is_whitespace() {
width += char.len_utf8();
len += char.text_len();
} else {
break;
}
}
// Mixed tabs and spaces
Self::Mixed { width, len }
}
/// Returns the indentation's visual width in columns/spaces.
///
/// For docstring indentation, black counts spaces as 1 and tabs by increasing the indentation up
/// to the next multiple of 8. This is effectively a port of
/// [`str.expandtabs`](https://docs.python.org/3/library/stdtypes.html#str.expandtabs),
/// which black [calls with the default tab width of 8](https://github.com/psf/black/blob/c36e468794f9256d5e922c399240d49782ba04f1/src/black/strings.py#L61).
const fn width(self) -> usize {
match self {
Self::Spaces(count) => count,
Self::Tabs(count) => count * Self::TAB_INDENT_WIDTH,
Self::TabSpaces { tabs, spaces } => tabs * Self::TAB_INDENT_WIDTH + spaces,
Self::SpacesTabs { spaces, tabs } => {
let mut indent = spaces;
indent += Self::TAB_INDENT_WIDTH - indent.rem_euclid(Self::TAB_INDENT_WIDTH);
indent + (tabs - 1) * Self::TAB_INDENT_WIDTH
}
Self::Mixed { width, .. } => width,
}
}
trimmed
/// Returns the length of the indentation in bytes.
///
/// # Panics
/// If the indentation is longer than 4GB.
fn text_len(self) -> TextSize {
let len = match self {
Self::Spaces(count) => count,
Self::Tabs(count) => count,
Self::TabSpaces { tabs, spaces } => tabs + spaces,
Self::SpacesTabs { spaces, tabs } => spaces + tabs,
Self::Mixed { len, .. } => return len,
};
TextSize::try_from(len).unwrap()
}
/// Trims the indent of `rhs` by `self`.
///
/// Returns `None` if `self` is not a prefix of `rhs` or either `self` or `rhs` use mixed indentation.
fn trim_start(self, rhs: Self) -> Option<Self> {
let (left_tabs, left_spaces) = match self {
Self::Spaces(spaces) => (0usize, spaces),
Self::Tabs(tabs) => (tabs, 0usize),
Self::TabSpaces { tabs, spaces } => (tabs, spaces),
// Handle spaces here because it is the only indent where the spaces come before the tabs.
Self::SpacesTabs {
spaces: left_spaces,
tabs: left_tabs,
} => {
return match rhs {
Self::Spaces(right_spaces) => {
left_spaces.checked_sub(right_spaces).map(|spaces| {
if spaces == 0 {
Self::Tabs(left_tabs)
} else {
Self::SpacesTabs {
tabs: left_tabs,
spaces,
}
}
})
}
Self::SpacesTabs {
spaces: right_spaces,
tabs: right_tabs,
} => left_spaces.checked_sub(right_spaces).and_then(|spaces| {
let tabs = left_tabs.checked_sub(right_tabs)?;
Some(if spaces == 0 {
if tabs == 0 {
Self::Spaces(0)
} else {
Self::Tabs(tabs)
}
} else {
Self::SpacesTabs { spaces, tabs }
})
}),
_ => None,
}
}
Self::Mixed { .. } => return None,
};
let (right_tabs, right_spaces) = match rhs {
Self::Spaces(spaces) => (0usize, spaces),
Self::Tabs(tabs) => (tabs, 0usize),
Self::TabSpaces { tabs, spaces } => (tabs, spaces),
Self::SpacesTabs { .. } | Self::Mixed { .. } => return None,
};
let tabs = left_tabs.checked_sub(right_tabs)?;
let spaces = left_spaces.checked_sub(right_spaces)?;
Some(if tabs == 0 {
Self::Spaces(spaces)
} else if spaces == 0 {
Self::Tabs(tabs)
} else {
Self::TabSpaces { tabs, spaces }
})
}
/// Trims at most `indent_len` indentation from the beginning of `line`.
///
/// This is useful when one needs to trim some minimum
/// level of indentation from a code snippet collected from a docstring before
/// attempting to reformat it.
fn trim_start_str(self, line: &str) -> &str {
let mut seen_indent_len = 0;
let mut trimmed = line;
let indent_len = self.width();
for char in line.chars() {
if seen_indent_len >= indent_len {
return trimmed;
}
if char == '\t' {
// Pad to the next multiple of tab_width
seen_indent_len +=
Self::TAB_INDENT_WIDTH - (seen_indent_len.rem_euclid(Self::TAB_INDENT_WIDTH));
trimmed = &trimmed[1..];
} else if char.is_whitespace() {
seen_indent_len += char.len_utf8();
trimmed = &trimmed[char.len_utf8()..];
} else {
break;
}
}
trimmed
}
const fn is_spaces_tabs(self) -> bool {
matches!(self, Self::SpacesTabs { .. })
}
}
impl PartialOrd for Indentation {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
Some(self.width().cmp(&other.width()))
}
}
impl PartialEq for Indentation {
fn eq(&self, other: &Self) -> bool {
self.width() == other.width()
}
}
impl Default for Indentation {
fn default() -> Self {
Self::Spaces(0)
}
}
/// Returns the indentation of the given line and everything following it.
@@ -1613,14 +1837,13 @@ fn is_rst_option(line: &str) -> bool {
#[cfg(test)]
mod tests {
use super::indentation_length;
use crate::string::docstring::Indentation;
#[test]
fn test_indentation_like_black() {
assert_eq!(indentation_length("\t \t \t"), 24);
assert_eq!(indentation_length("\t \t"), 24);
assert_eq!(indentation_length("\t\t\t"), 24);
assert_eq!(indentation_length(" "), 4);
assert_eq!(Indentation::from_str("\t \t \t").width(), 24);
assert_eq!(Indentation::from_str("\t \t").width(), 24);
assert_eq!(Indentation::from_str("\t\t\t").width(), 24);
assert_eq!(Indentation::from_str(" ").width(), 4);
}
}

View File

@@ -1,26 +1,19 @@
use std::borrow::Cow;
use std::iter::FusedIterator;
use bitflags::bitflags;
use memchr::memchr2;
use ruff_formatter::{format_args, write};
use ruff_python_ast::{
self as ast, Expr, ExprBytesLiteral, ExprFString, ExprStringLiteral, ExpressionRef,
};
use ruff_python_ast::{AnyNodeRef, StringLiteral};
pub(crate) use any::AnyString;
pub(crate) use normalize::{normalize_string, NormalizedString, StringNormalizer};
use ruff_formatter::format_args;
use ruff_source_file::Locator;
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};
use ruff_text_size::{TextLen, TextRange, TextSize};
use crate::comments::{leading_comments, trailing_comments};
use crate::expression::expr_f_string::f_string_quoting;
use crate::expression::parentheses::in_parentheses_only_soft_line_break_or_space;
use crate::other::f_string::FormatFString;
use crate::other::string_literal::{FormatStringLiteral, StringLiteralKind};
use crate::prelude::*;
use crate::QuoteStyle;
mod any;
pub(crate) mod docstring;
mod normalize;
#[derive(Copy, Clone, Debug, Default)]
pub(crate) enum Quoting {
@@ -29,202 +22,6 @@ pub(crate) enum Quoting {
Preserve,
}
/// Represents any kind of string expression. This could be either a string,
/// bytes or f-string.
#[derive(Copy, Clone, Debug)]
pub(crate) enum AnyString<'a> {
String(&'a ExprStringLiteral),
Bytes(&'a ExprBytesLiteral),
FString(&'a ExprFString),
}
impl<'a> AnyString<'a> {
/// Creates a new [`AnyString`] from the given [`Expr`].
///
/// Returns `None` if the expression is not either a string, bytes or f-string.
pub(crate) fn from_expression(expression: &'a Expr) -> Option<AnyString<'a>> {
match expression {
Expr::StringLiteral(string) => Some(AnyString::String(string)),
Expr::BytesLiteral(bytes) => Some(AnyString::Bytes(bytes)),
Expr::FString(fstring) => Some(AnyString::FString(fstring)),
_ => None,
}
}
/// Returns `true` if the string is implicitly concatenated.
pub(crate) fn is_implicit_concatenated(self) -> bool {
match self {
Self::String(ExprStringLiteral { value, .. }) => value.is_implicit_concatenated(),
Self::Bytes(ExprBytesLiteral { value, .. }) => value.is_implicit_concatenated(),
Self::FString(ExprFString { value, .. }) => value.is_implicit_concatenated(),
}
}
/// Returns the quoting to be used for this string.
fn quoting(self, locator: &Locator<'_>) -> Quoting {
match self {
Self::String(_) | Self::Bytes(_) => Quoting::CanChange,
Self::FString(f_string) => f_string_quoting(f_string, locator),
}
}
/// Returns a vector of all the [`AnyStringPart`] of this string.
fn parts(self, quoting: Quoting) -> AnyStringPartsIter<'a> {
match self {
Self::String(ExprStringLiteral { value, .. }) => {
AnyStringPartsIter::String(value.iter())
}
Self::Bytes(ExprBytesLiteral { value, .. }) => AnyStringPartsIter::Bytes(value.iter()),
Self::FString(ExprFString { value, .. }) => {
AnyStringPartsIter::FString(value.iter(), quoting)
}
}
}
pub(crate) fn is_multiline(self, source: &str) -> bool {
match self {
AnyString::String(_) | AnyString::Bytes(_) => {
let contents = &source[self.range()];
let prefix = StringPrefix::parse(contents);
let quotes = StringQuotes::parse(
&contents[TextRange::new(prefix.text_len(), contents.text_len())],
);
quotes.is_some_and(StringQuotes::is_triple)
&& memchr2(b'\n', b'\r', contents.as_bytes()).is_some()
}
AnyString::FString(fstring) => {
memchr2(b'\n', b'\r', source[fstring.range].as_bytes()).is_some()
}
}
}
}
impl Ranged for AnyString<'_> {
fn range(&self) -> TextRange {
match self {
Self::String(expr) => expr.range(),
Self::Bytes(expr) => expr.range(),
Self::FString(expr) => expr.range(),
}
}
}
impl<'a> From<&AnyString<'a>> for AnyNodeRef<'a> {
fn from(value: &AnyString<'a>) -> Self {
match value {
AnyString::String(expr) => AnyNodeRef::ExprStringLiteral(expr),
AnyString::Bytes(expr) => AnyNodeRef::ExprBytesLiteral(expr),
AnyString::FString(expr) => AnyNodeRef::ExprFString(expr),
}
}
}
impl<'a> From<AnyString<'a>> for AnyNodeRef<'a> {
fn from(value: AnyString<'a>) -> Self {
AnyNodeRef::from(&value)
}
}
impl<'a> From<&AnyString<'a>> for ExpressionRef<'a> {
fn from(value: &AnyString<'a>) -> Self {
match value {
AnyString::String(expr) => ExpressionRef::StringLiteral(expr),
AnyString::Bytes(expr) => ExpressionRef::BytesLiteral(expr),
AnyString::FString(expr) => ExpressionRef::FString(expr),
}
}
}
enum AnyStringPartsIter<'a> {
String(std::slice::Iter<'a, StringLiteral>),
Bytes(std::slice::Iter<'a, ast::BytesLiteral>),
FString(std::slice::Iter<'a, ast::FStringPart>, Quoting),
}
impl<'a> Iterator for AnyStringPartsIter<'a> {
type Item = AnyStringPart<'a>;
fn next(&mut self) -> Option<Self::Item> {
let part = match self {
Self::String(inner) => {
let part = inner.next()?;
AnyStringPart::String {
part,
layout: StringLiteralKind::String,
}
}
Self::Bytes(inner) => AnyStringPart::Bytes(inner.next()?),
Self::FString(inner, quoting) => {
let part = inner.next()?;
match part {
ast::FStringPart::Literal(string_literal) => AnyStringPart::String {
part: string_literal,
layout: StringLiteralKind::InImplicitlyConcatenatedFString(*quoting),
},
ast::FStringPart::FString(f_string) => AnyStringPart::FString {
part: f_string,
quoting: *quoting,
},
}
}
};
Some(part)
}
}
impl FusedIterator for AnyStringPartsIter<'_> {}
/// Represents any kind of string which is part of an implicitly concatenated
/// string. This could be either a string, bytes or f-string.
///
/// This is constructed from the [`AnyString::parts`] method on [`AnyString`].
#[derive(Clone, Debug)]
enum AnyStringPart<'a> {
String {
part: &'a ast::StringLiteral,
layout: StringLiteralKind,
},
Bytes(&'a ast::BytesLiteral),
FString {
part: &'a ast::FString,
quoting: Quoting,
},
}
impl<'a> From<&AnyStringPart<'a>> for AnyNodeRef<'a> {
fn from(value: &AnyStringPart<'a>) -> Self {
match value {
AnyStringPart::String { part, .. } => AnyNodeRef::StringLiteral(part),
AnyStringPart::Bytes(part) => AnyNodeRef::BytesLiteral(part),
AnyStringPart::FString { part, .. } => AnyNodeRef::FString(part),
}
}
}
impl Ranged for AnyStringPart<'_> {
fn range(&self) -> TextRange {
match self {
Self::String { part, .. } => part.range(),
Self::Bytes(part) => part.range(),
Self::FString { part, .. } => part.range(),
}
}
}
impl Format<PyFormatContext<'_>> for AnyStringPart<'_> {
fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
match self {
AnyStringPart::String { part, layout } => {
FormatStringLiteral::new(part, *layout).fmt(f)
}
AnyStringPart::Bytes(bytes_literal) => bytes_literal.format().fmt(f),
AnyStringPart::FString { part, quoting } => FormatFString::new(part, *quoting).fmt(f),
}
}
}
/// Formats any implicitly concatenated string. This could be any valid combination
/// of string, bytes or f-string literals.
pub(crate) struct FormatStringContinuation<'a> {
@@ -291,139 +88,22 @@ impl StringPart {
}
}
/// Computes the strings preferred quotes and normalizes its content.
///
/// The parent docstring quote style should be set when formatting a code
/// snippet within the docstring. The quote style should correspond to the
/// style of quotes used by said docstring. Normalization will ensure the
/// quoting styles don't conflict.
pub(crate) fn normalize<'a>(
self,
quoting: Quoting,
locator: &'a Locator,
configured_style: QuoteStyle,
parent_docstring_quote_char: Option<QuoteChar>,
normalize_hex: bool,
) -> NormalizedString<'a> {
// Per PEP 8, always prefer double quotes for triple-quoted strings.
let preferred_style = if self.quotes.triple {
// ... unless we're formatting a code snippet inside a docstring,
// then we specifically want to invert our quote style to avoid
// writing out invalid Python.
//
// It's worth pointing out that we can actually wind up being
// somewhat out of sync with PEP8 in this case. Consider this
// example:
//
// def foo():
// '''
// Something.
//
// >>> """tricksy"""
// '''
// pass
//
// Ideally, this would be reformatted as:
//
// def foo():
// """
// Something.
//
// >>> '''tricksy'''
// """
// pass
//
// But the logic here results in the original quoting being
// preserved. This is because the quoting style of the outer
// docstring is determined, in part, by looking at its contents. In
// this case, it notices that it contains a `"""` and thus infers
// that using `'''` would overall read better because it avoids
// the need to escape the interior `"""`. Except... in this case,
// the `"""` is actually part of a code snippet that could get
// reformatted to using a different quoting style itself.
//
// Fixing this would, I believe, require some fairly seismic
// changes to how formatting strings works. Namely, we would need
// to look for code snippets before normalizing the docstring, and
// then figure out the quoting style more holistically by looking
// at the various kinds of quotes used in the code snippets and
// what reformatting them might look like.
//
// Overall this is a bit of a corner case and just inverting the
// style from what the parent ultimately decided upon works, even
// if it doesn't have perfect alignment with PEP8.
if let Some(quote) = parent_docstring_quote_char {
QuoteStyle::from(quote.invert())
} else {
QuoteStyle::Double
}
} else {
configured_style
};
let raw_content = &locator.slice(self.content_range);
let quotes = match quoting {
Quoting::Preserve => self.quotes,
Quoting::CanChange => {
if let Some(preferred_quote) = QuoteChar::from_style(preferred_style) {
if self.prefix.is_raw_string() {
choose_quotes_raw(raw_content, self.quotes, preferred_quote)
} else {
choose_quotes(raw_content, self.quotes, preferred_quote)
}
} else {
self.quotes
}
}
};
let normalized = normalize_string(raw_content, quotes, self.prefix, normalize_hex);
NormalizedString {
prefix: self.prefix,
content_range: self.content_range,
text: normalized,
quotes,
}
/// Returns the prefix of the string part.
pub(crate) const fn prefix(&self) -> StringPrefix {
self.prefix
}
}
#[derive(Debug)]
pub(crate) struct NormalizedString<'a> {
prefix: StringPrefix,
/// Returns the surrounding quotes of the string part.
pub(crate) const fn quotes(&self) -> StringQuotes {
self.quotes
}
/// The quotes of the normalized string (preferred quotes)
quotes: StringQuotes,
/// The range of the string's content in the source (minus prefix and quotes).
content_range: TextRange,
/// The normalized text
text: Cow<'a, str>,
}
impl Ranged for NormalizedString<'_> {
fn range(&self) -> TextRange {
/// Returns the range of the string's content in the source (minus prefix and quotes).
pub(crate) const fn content_range(&self) -> TextRange {
self.content_range
}
}
impl Format<PyFormatContext<'_>> for NormalizedString<'_> {
fn fmt(&self, f: &mut Formatter<PyFormatContext<'_>>) -> FormatResult<()> {
write!(f, [self.prefix, self.quotes])?;
match &self.text {
Cow::Borrowed(_) => {
source_text_slice(self.range()).fmt(f)?;
}
Cow::Owned(normalized) => {
text(normalized).fmt(f)?;
}
}
self.quotes.fmt(f)
}
}
bitflags! {
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub(crate) struct StringPrefix: u8 {
@@ -504,171 +184,6 @@ impl Format<PyFormatContext<'_>> for StringPrefix {
}
}
/// Choose the appropriate quote style for a raw string.
///
/// The preferred quote style is chosen unless the string contains unescaped quotes of the
/// preferred style. For example, `r"foo"` is chosen over `r'foo'` if the preferred quote
/// style is double quotes.
fn choose_quotes_raw(
input: &str,
quotes: StringQuotes,
preferred_quote: QuoteChar,
) -> StringQuotes {
let preferred_quote_char = preferred_quote.as_char();
let mut chars = input.chars().peekable();
let contains_unescaped_configured_quotes = loop {
match chars.next() {
Some('\\') => {
// Ignore escaped characters
chars.next();
}
// `"` or `'`
Some(c) if c == preferred_quote_char => {
if !quotes.triple {
break true;
}
match chars.peek() {
// We can't turn `r'''\""'''` into `r"""\"""""`, this would confuse the parser
// about where the closing triple quotes start
None => break true,
Some(next) if *next == preferred_quote_char => {
// `""` or `''`
chars.next();
// We can't turn `r'''""'''` into `r""""""""`, nor can we have
// `"""` or `'''` respectively inside the string
if chars.peek().is_none() || chars.peek() == Some(&preferred_quote_char) {
break true;
}
}
_ => {}
}
}
Some(_) => continue,
None => break false,
}
};
StringQuotes {
triple: quotes.triple,
quote_char: if contains_unescaped_configured_quotes {
quotes.quote_char
} else {
preferred_quote
},
}
}
/// Choose the appropriate quote style for a string.
///
/// For single quoted strings, the preferred quote style is used, unless the alternative quote style
/// would require fewer escapes.
///
/// For triple quoted strings, the preferred quote style is always used, unless the string contains
/// a triplet of the quote character (e.g., if double quotes are preferred, double quotes will be
/// used unless the string contains `"""`).
fn choose_quotes(input: &str, quotes: StringQuotes, preferred_quote: QuoteChar) -> StringQuotes {
let quote = if quotes.triple {
// True if the string contains a triple quote sequence of the configured quote style.
let mut uses_triple_quotes = false;
let mut chars = input.chars().peekable();
while let Some(c) = chars.next() {
let preferred_quote_char = preferred_quote.as_char();
match c {
'\\' => {
if matches!(chars.peek(), Some('"' | '\\')) {
chars.next();
}
}
// `"` or `'`
c if c == preferred_quote_char => {
match chars.peek().copied() {
Some(c) if c == preferred_quote_char => {
// `""` or `''`
chars.next();
match chars.peek().copied() {
Some(c) if c == preferred_quote_char => {
// `"""` or `'''`
chars.next();
uses_triple_quotes = true;
break;
}
Some(_) => {}
None => {
// Handle `''' ""'''`. At this point we have consumed both
// double quotes, so on the next iteration the iterator is empty
// and we'd miss the string ending with a preferred quote
uses_triple_quotes = true;
break;
}
}
}
Some(_) => {
// A single quote char, this is ok
}
None => {
// Trailing quote at the end of the comment
uses_triple_quotes = true;
break;
}
}
}
_ => continue,
}
}
if uses_triple_quotes {
// String contains a triple quote sequence of the configured quote style.
// Keep the existing quote style.
quotes.quote_char
} else {
preferred_quote
}
} else {
let mut single_quotes = 0u32;
let mut double_quotes = 0u32;
for c in input.chars() {
match c {
'\'' => {
single_quotes += 1;
}
'"' => {
double_quotes += 1;
}
_ => continue,
}
}
match preferred_quote {
QuoteChar::Single => {
if single_quotes > double_quotes {
QuoteChar::Double
} else {
QuoteChar::Single
}
}
QuoteChar::Double => {
if double_quotes > single_quotes {
QuoteChar::Single
} else {
QuoteChar::Double
}
}
}
};
StringQuotes {
triple: quotes.triple,
quote_char: quote,
}
}
#[derive(Copy, Clone, Debug)]
pub(crate) struct StringQuotes {
triple: bool,
@@ -772,269 +287,3 @@ impl TryFrom<char> for QuoteChar {
}
}
}
/// Adds the necessary quote escapes and removes unnecessary escape sequences when quoting `input`
/// with the provided [`StringQuotes`] style.
///
/// Returns the normalized string and whether it contains new lines.
fn normalize_string(
input: &str,
quotes: StringQuotes,
prefix: StringPrefix,
normalize_hex: bool,
) -> Cow<str> {
// The normalized string if `input` is not yet normalized.
// `output` must remain empty if `input` is already normalized.
let mut output = String::new();
// Tracks the last index of `input` that has been written to `output`.
// If `last_index` is `0` at the end, then the input is already normalized and can be returned as is.
let mut last_index = 0;
let quote = quotes.quote_char;
let preferred_quote = quote.as_char();
let opposite_quote = quote.invert().as_char();
let mut chars = input.char_indices().peekable();
let is_raw = prefix.is_raw_string();
let is_fstring = prefix.is_fstring();
let mut formatted_value_nesting = 0u32;
while let Some((index, c)) = chars.next() {
if is_fstring && matches!(c, '{' | '}') {
if chars.peek().copied().is_some_and(|(_, next)| next == c) {
// Skip over the second character of the double braces
chars.next();
} else if c == '{' {
formatted_value_nesting += 1;
} else {
// Safe to assume that `c == '}'` here because of the matched pattern above
formatted_value_nesting = formatted_value_nesting.saturating_sub(1);
}
continue;
}
if c == '\r' {
output.push_str(&input[last_index..index]);
// Skip over the '\r' character, keep the `\n`
if chars.peek().copied().is_some_and(|(_, next)| next == '\n') {
chars.next();
}
// Replace the `\r` with a `\n`
else {
output.push('\n');
}
last_index = index + '\r'.len_utf8();
} else if !is_raw {
if c == '\\' {
if let Some((_, next)) = chars.clone().next() {
if next == '\\' {
// Skip over escaped backslashes
chars.next();
} else if normalize_hex {
if let Some(normalised) = UnicodeEscape::new(next, !prefix.is_byte())
.and_then(|escape| {
escape.normalize(&input[index + c.len_utf8() + next.len_utf8()..])
})
{
// Length of the `\` plus the length of the escape sequence character (`u` | `U` | `x`)
let escape_start_len = '\\'.len_utf8() + next.len_utf8();
let escape_start_offset = index + escape_start_len;
if let Cow::Owned(normalised) = &normalised {
output.push_str(&input[last_index..escape_start_offset]);
output.push_str(normalised);
last_index = escape_start_offset + normalised.len();
};
// Move the `chars` iterator passed the escape sequence.
// Simply reassigning `chars` doesn't work because the indices` would
// then be off.
for _ in 0..next.len_utf8() + normalised.len() {
chars.next();
}
}
}
if !quotes.triple {
#[allow(clippy::if_same_then_else)]
if next == opposite_quote && formatted_value_nesting == 0 {
// Remove the escape by ending before the backslash and starting again with the quote
chars.next();
output.push_str(&input[last_index..index]);
last_index = index + '\\'.len_utf8();
} else if next == preferred_quote {
// Quote is already escaped, skip over it.
chars.next();
}
}
}
} else if !quotes.triple && c == preferred_quote && formatted_value_nesting == 0 {
// Escape the quote
output.push_str(&input[last_index..index]);
output.push('\\');
output.push(c);
last_index = index + preferred_quote.len_utf8();
}
}
}
let normalized = if last_index == 0 {
Cow::Borrowed(input)
} else {
output.push_str(&input[last_index..]);
Cow::Owned(output)
};
normalized
}
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum UnicodeEscape {
/// A hex escape sequence of either 2 (`\x`), 4 (`\u`) or 8 (`\U`) hex characters.
Hex(usize),
/// An escaped unicode name (`\N{name}`)
CharacterName,
}
impl UnicodeEscape {
fn new(first: char, allow_unicode: bool) -> Option<UnicodeEscape> {
Some(match first {
'x' => UnicodeEscape::Hex(2),
'u' if allow_unicode => UnicodeEscape::Hex(4),
'U' if allow_unicode => UnicodeEscape::Hex(8),
'N' if allow_unicode => UnicodeEscape::CharacterName,
_ => return None,
})
}
/// Normalises `\u..`, `\U..`, `\x..` and `\N{..}` escape sequences to:
///
/// * `\u`, `\U'` and `\x`: To use lower case for the characters `a-f`.
/// * `\N`: To use uppercase letters
fn normalize(self, input: &str) -> Option<Cow<str>> {
let mut normalised = String::new();
let len = match self {
UnicodeEscape::Hex(len) => {
// It's not a valid escape sequence if the input string has fewer characters
// left than required by the escape sequence.
if input.len() < len {
return None;
}
for (index, c) in input.char_indices().take(len) {
match c {
'0'..='9' | 'a'..='f' => {
if !normalised.is_empty() {
normalised.push(c);
}
}
'A'..='F' => {
if normalised.is_empty() {
normalised.reserve(len);
normalised.push_str(&input[..index]);
normalised.push(c.to_ascii_lowercase());
} else {
normalised.push(c.to_ascii_lowercase());
}
}
_ => {
// not a valid escape sequence
return None;
}
}
}
len
}
UnicodeEscape::CharacterName => {
let mut char_indices = input.char_indices();
if !matches!(char_indices.next(), Some((_, '{'))) {
return None;
}
loop {
if let Some((index, c)) = char_indices.next() {
match c {
'}' => {
if !normalised.is_empty() {
normalised.push('}');
}
// Name must be at least two characters long.
if index < 3 {
return None;
}
break index + '}'.len_utf8();
}
'0'..='9' | 'A'..='Z' | ' ' | '-' => {
if !normalised.is_empty() {
normalised.push(c);
}
}
'a'..='z' => {
if normalised.is_empty() {
normalised.reserve(c.len_utf8() + '}'.len_utf8());
normalised.push_str(&input[..index]);
normalised.push(c.to_ascii_uppercase());
} else {
normalised.push(c.to_ascii_uppercase());
}
}
_ => {
// Seems like an invalid escape sequence, don't normalise it.
return None;
}
}
} else {
// Unterminated escape sequence, don't normalise it.
return None;
}
}
}
};
Some(if normalised.is_empty() {
Cow::Borrowed(&input[..len])
} else {
Cow::Owned(normalised)
})
}
}
#[cfg(test)]
mod tests {
use crate::string::{normalize_string, QuoteChar, StringPrefix, StringQuotes, UnicodeEscape};
use std::borrow::Cow;
#[test]
fn normalize_32_escape() {
let escape_sequence = UnicodeEscape::new('U', true).unwrap();
assert_eq!(
Some(Cow::Owned("0001f60e".to_string())),
escape_sequence.normalize("0001F60E")
);
}
#[test]
fn normalize_hex_in_byte_string() {
let input = r"\x89\x50\x4E\x47\x0D\x0A\x1A\x0A";
let normalized = normalize_string(
input,
StringQuotes {
triple: false,
quote_char: QuoteChar::Double,
},
StringPrefix::BYTE,
true,
);
assert_eq!(r"\x89\x50\x4e\x47\x0d\x0a\x1a\x0a", &normalized);
}
}

View File

@@ -0,0 +1,655 @@
use std::borrow::Cow;
use ruff_formatter::FormatContext;
use ruff_source_file::Locator;
use ruff_text_size::{Ranged, TextRange};
use crate::context::FStringState;
use crate::options::PythonVersion;
use crate::prelude::*;
use crate::preview::is_hex_codes_in_unicode_sequences_enabled;
use crate::string::{QuoteChar, Quoting, StringPart, StringPrefix, StringQuotes};
use crate::QuoteStyle;
pub(crate) struct StringNormalizer {
quoting: Quoting,
preferred_quote_style: QuoteStyle,
parent_docstring_quote_char: Option<QuoteChar>,
f_string_state: FStringState,
target_version: PythonVersion,
normalize_hex: bool,
}
impl StringNormalizer {
pub(crate) fn from_context(context: &PyFormatContext<'_>) -> Self {
Self {
quoting: Quoting::default(),
preferred_quote_style: QuoteStyle::default(),
parent_docstring_quote_char: context.docstring(),
f_string_state: context.f_string_state(),
target_version: context.options().target_version(),
normalize_hex: is_hex_codes_in_unicode_sequences_enabled(context),
}
}
pub(crate) fn with_preferred_quote_style(mut self, quote_style: QuoteStyle) -> Self {
self.preferred_quote_style = quote_style;
self
}
pub(crate) fn with_quoting(mut self, quoting: Quoting) -> Self {
self.quoting = quoting;
self
}
/// Computes the strings preferred quotes.
pub(crate) fn choose_quotes(&self, string: &StringPart, locator: &Locator) -> StringQuotes {
// Per PEP 8, always prefer double quotes for triple-quoted strings.
// Except when using quote-style-preserve.
let preferred_style = if string.quotes().triple {
// ... unless we're formatting a code snippet inside a docstring,
// then we specifically want to invert our quote style to avoid
// writing out invalid Python.
//
// It's worth pointing out that we can actually wind up being
// somewhat out of sync with PEP8 in this case. Consider this
// example:
//
// def foo():
// '''
// Something.
//
// >>> """tricksy"""
// '''
// pass
//
// Ideally, this would be reformatted as:
//
// def foo():
// """
// Something.
//
// >>> '''tricksy'''
// """
// pass
//
// But the logic here results in the original quoting being
// preserved. This is because the quoting style of the outer
// docstring is determined, in part, by looking at its contents. In
// this case, it notices that it contains a `"""` and thus infers
// that using `'''` would overall read better because it avoids
// the need to escape the interior `"""`. Except... in this case,
// the `"""` is actually part of a code snippet that could get
// reformatted to using a different quoting style itself.
//
// Fixing this would, I believe, require some fairly seismic
// changes to how formatting strings works. Namely, we would need
// to look for code snippets before normalizing the docstring, and
// then figure out the quoting style more holistically by looking
// at the various kinds of quotes used in the code snippets and
// what reformatting them might look like.
//
// Overall this is a bit of a corner case and just inverting the
// style from what the parent ultimately decided upon works, even
// if it doesn't have perfect alignment with PEP8.
if let Some(quote) = self.parent_docstring_quote_char {
QuoteStyle::from(quote.invert())
} else if self.preferred_quote_style.is_preserve() {
QuoteStyle::Preserve
} else {
QuoteStyle::Double
}
} else {
self.preferred_quote_style
};
let quoting = if let FStringState::InsideExpressionElement(context) = self.f_string_state {
// If we're inside an f-string, we need to make sure to preserve the
// existing quotes unless we're inside a triple-quoted f-string and
// the inner string itself isn't triple-quoted. For example:
//
// ```python
// f"""outer {"inner"}""" # Valid
// f"""outer {"""inner"""}""" # Invalid
// ```
//
// Or, if the target version supports PEP 701.
//
// The reason to preserve the quotes is based on the assumption that
// the original f-string is valid in terms of quoting, and we don't
// want to change that to make it invalid.
if (context.quotes().is_triple() && !string.quotes().is_triple())
|| self.target_version.supports_pep_701()
{
self.quoting
} else {
Quoting::Preserve
}
} else {
self.quoting
};
match quoting {
Quoting::Preserve => string.quotes(),
Quoting::CanChange => {
if let Some(preferred_quote) = QuoteChar::from_style(preferred_style) {
let raw_content = locator.slice(string.content_range());
if string.prefix().is_raw_string() {
choose_quotes_for_raw_string(raw_content, string.quotes(), preferred_quote)
} else {
choose_quotes_impl(raw_content, string.quotes(), preferred_quote)
}
} else {
string.quotes()
}
}
}
}
/// Computes the strings preferred quotes and normalizes its content.
pub(crate) fn normalize<'a>(
&self,
string: &StringPart,
locator: &'a Locator,
) -> NormalizedString<'a> {
let raw_content = locator.slice(string.content_range());
let quotes = self.choose_quotes(string, locator);
let normalized = normalize_string(raw_content, quotes, string.prefix(), self.normalize_hex);
NormalizedString {
prefix: string.prefix(),
content_range: string.content_range(),
text: normalized,
quotes,
}
}
}
#[derive(Debug)]
pub(crate) struct NormalizedString<'a> {
prefix: crate::string::StringPrefix,
/// The quotes of the normalized string (preferred quotes)
quotes: StringQuotes,
/// The range of the string's content in the source (minus prefix and quotes).
content_range: TextRange,
/// The normalized text
text: Cow<'a, str>,
}
impl<'a> NormalizedString<'a> {
pub(crate) fn text(&self) -> &Cow<'a, str> {
&self.text
}
pub(crate) fn quotes(&self) -> StringQuotes {
self.quotes
}
pub(crate) fn prefix(&self) -> StringPrefix {
self.prefix
}
}
impl Ranged for NormalizedString<'_> {
fn range(&self) -> TextRange {
self.content_range
}
}
impl Format<PyFormatContext<'_>> for NormalizedString<'_> {
fn fmt(&self, f: &mut Formatter<PyFormatContext<'_>>) -> FormatResult<()> {
ruff_formatter::write!(f, [self.prefix, self.quotes])?;
match &self.text {
Cow::Borrowed(_) => {
source_text_slice(self.range()).fmt(f)?;
}
Cow::Owned(normalized) => {
text(normalized).fmt(f)?;
}
}
self.quotes.fmt(f)
}
}
/// Choose the appropriate quote style for a raw string.
///
/// The preferred quote style is chosen unless the string contains unescaped quotes of the
/// preferred style. For example, `r"foo"` is chosen over `r'foo'` if the preferred quote
/// style is double quotes.
fn choose_quotes_for_raw_string(
input: &str,
quotes: StringQuotes,
preferred_quote: QuoteChar,
) -> StringQuotes {
let preferred_quote_char = preferred_quote.as_char();
let mut chars = input.chars().peekable();
let contains_unescaped_configured_quotes = loop {
match chars.next() {
Some('\\') => {
// Ignore escaped characters
chars.next();
}
// `"` or `'`
Some(c) if c == preferred_quote_char => {
if !quotes.triple {
break true;
}
match chars.peek() {
// We can't turn `r'''\""'''` into `r"""\"""""`, this would confuse the parser
// about where the closing triple quotes start
None => break true,
Some(next) if *next == preferred_quote_char => {
// `""` or `''`
chars.next();
// We can't turn `r'''""'''` into `r""""""""`, nor can we have
// `"""` or `'''` respectively inside the string
if chars.peek().is_none() || chars.peek() == Some(&preferred_quote_char) {
break true;
}
}
_ => {}
}
}
Some(_) => continue,
None => break false,
}
};
StringQuotes {
triple: quotes.triple,
quote_char: if contains_unescaped_configured_quotes {
quotes.quote_char
} else {
preferred_quote
},
}
}
/// Choose the appropriate quote style for a string.
///
/// For single quoted strings, the preferred quote style is used, unless the alternative quote style
/// would require fewer escapes.
///
/// For triple quoted strings, the preferred quote style is always used, unless the string contains
/// a triplet of the quote character (e.g., if double quotes are preferred, double quotes will be
/// used unless the string contains `"""`).
fn choose_quotes_impl(
input: &str,
quotes: StringQuotes,
preferred_quote: QuoteChar,
) -> StringQuotes {
let quote = if quotes.triple {
// True if the string contains a triple quote sequence of the configured quote style.
let mut uses_triple_quotes = false;
let mut chars = input.chars().peekable();
while let Some(c) = chars.next() {
let preferred_quote_char = preferred_quote.as_char();
match c {
'\\' => {
if matches!(chars.peek(), Some('"' | '\\')) {
chars.next();
}
}
// `"` or `'`
c if c == preferred_quote_char => {
match chars.peek().copied() {
Some(c) if c == preferred_quote_char => {
// `""` or `''`
chars.next();
match chars.peek().copied() {
Some(c) if c == preferred_quote_char => {
// `"""` or `'''`
chars.next();
uses_triple_quotes = true;
break;
}
Some(_) => {}
None => {
// Handle `''' ""'''`. At this point we have consumed both
// double quotes, so on the next iteration the iterator is empty
// and we'd miss the string ending with a preferred quote
uses_triple_quotes = true;
break;
}
}
}
Some(_) => {
// A single quote char, this is ok
}
None => {
// Trailing quote at the end of the comment
uses_triple_quotes = true;
break;
}
}
}
_ => continue,
}
}
if uses_triple_quotes {
// String contains a triple quote sequence of the configured quote style.
// Keep the existing quote style.
quotes.quote_char
} else {
preferred_quote
}
} else {
let mut single_quotes = 0u32;
let mut double_quotes = 0u32;
for c in input.chars() {
match c {
'\'' => {
single_quotes += 1;
}
'"' => {
double_quotes += 1;
}
_ => continue,
}
}
match preferred_quote {
QuoteChar::Single => {
if single_quotes > double_quotes {
QuoteChar::Double
} else {
QuoteChar::Single
}
}
QuoteChar::Double => {
if double_quotes > single_quotes {
QuoteChar::Single
} else {
QuoteChar::Double
}
}
}
};
StringQuotes {
triple: quotes.triple,
quote_char: quote,
}
}
/// Adds the necessary quote escapes and removes unnecessary escape sequences when quoting `input`
/// with the provided [`StringQuotes`] style.
///
/// Returns the normalized string and whether it contains new lines.
pub(crate) fn normalize_string(
input: &str,
quotes: StringQuotes,
prefix: StringPrefix,
normalize_hex: bool,
) -> Cow<str> {
// The normalized string if `input` is not yet normalized.
// `output` must remain empty if `input` is already normalized.
let mut output = String::new();
// Tracks the last index of `input` that has been written to `output`.
// If `last_index` is `0` at the end, then the input is already normalized and can be returned as is.
let mut last_index = 0;
let quote = quotes.quote_char;
let preferred_quote = quote.as_char();
let opposite_quote = quote.invert().as_char();
let mut chars = input.char_indices().peekable();
let is_raw = prefix.is_raw_string();
let is_fstring = prefix.is_fstring();
let mut formatted_value_nesting = 0u32;
while let Some((index, c)) = chars.next() {
if is_fstring && matches!(c, '{' | '}') {
if chars.peek().copied().is_some_and(|(_, next)| next == c) {
// Skip over the second character of the double braces
chars.next();
} else if c == '{' {
formatted_value_nesting += 1;
} else {
// Safe to assume that `c == '}'` here because of the matched pattern above
formatted_value_nesting = formatted_value_nesting.saturating_sub(1);
}
continue;
}
if c == '\r' {
output.push_str(&input[last_index..index]);
// Skip over the '\r' character, keep the `\n`
if chars.peek().copied().is_some_and(|(_, next)| next == '\n') {
chars.next();
}
// Replace the `\r` with a `\n`
else {
output.push('\n');
}
last_index = index + '\r'.len_utf8();
} else if !is_raw {
if c == '\\' {
if let Some((_, next)) = chars.clone().next() {
if next == '\\' {
// Skip over escaped backslashes
chars.next();
} else if normalize_hex {
if let Some(normalised) = UnicodeEscape::new(next, !prefix.is_byte())
.and_then(|escape| {
escape.normalize(&input[index + c.len_utf8() + next.len_utf8()..])
})
{
// Length of the `\` plus the length of the escape sequence character (`u` | `U` | `x`)
let escape_start_len = '\\'.len_utf8() + next.len_utf8();
let escape_start_offset = index + escape_start_len;
if let Cow::Owned(normalised) = &normalised {
output.push_str(&input[last_index..escape_start_offset]);
output.push_str(normalised);
last_index = escape_start_offset + normalised.len();
};
// Move the `chars` iterator passed the escape sequence.
// Simply reassigning `chars` doesn't work because the indices` would
// then be off.
for _ in 0..next.len_utf8() + normalised.len() {
chars.next();
}
}
}
if !quotes.triple {
#[allow(clippy::if_same_then_else)]
if next == opposite_quote && formatted_value_nesting == 0 {
// Remove the escape by ending before the backslash and starting again with the quote
chars.next();
output.push_str(&input[last_index..index]);
last_index = index + '\\'.len_utf8();
} else if next == preferred_quote {
// Quote is already escaped, skip over it.
chars.next();
}
}
}
} else if !quotes.triple && c == preferred_quote && formatted_value_nesting == 0 {
// Escape the quote
output.push_str(&input[last_index..index]);
output.push('\\');
output.push(c);
last_index = index + preferred_quote.len_utf8();
}
}
}
let normalized = if last_index == 0 {
Cow::Borrowed(input)
} else {
output.push_str(&input[last_index..]);
Cow::Owned(output)
};
normalized
}
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum UnicodeEscape {
/// A hex escape sequence of either 2 (`\x`), 4 (`\u`) or 8 (`\U`) hex characters.
Hex(usize),
/// An escaped unicode name (`\N{name}`)
CharacterName,
}
impl UnicodeEscape {
fn new(first: char, allow_unicode: bool) -> Option<UnicodeEscape> {
Some(match first {
'x' => UnicodeEscape::Hex(2),
'u' if allow_unicode => UnicodeEscape::Hex(4),
'U' if allow_unicode => UnicodeEscape::Hex(8),
'N' if allow_unicode => UnicodeEscape::CharacterName,
_ => return None,
})
}
/// Normalises `\u..`, `\U..`, `\x..` and `\N{..}` escape sequences to:
///
/// * `\u`, `\U'` and `\x`: To use lower case for the characters `a-f`.
/// * `\N`: To use uppercase letters
fn normalize(self, input: &str) -> Option<Cow<str>> {
let mut normalised = String::new();
let len = match self {
UnicodeEscape::Hex(len) => {
// It's not a valid escape sequence if the input string has fewer characters
// left than required by the escape sequence.
if input.len() < len {
return None;
}
for (index, c) in input.char_indices().take(len) {
match c {
'0'..='9' | 'a'..='f' => {
if !normalised.is_empty() {
normalised.push(c);
}
}
'A'..='F' => {
if normalised.is_empty() {
normalised.reserve(len);
normalised.push_str(&input[..index]);
normalised.push(c.to_ascii_lowercase());
} else {
normalised.push(c.to_ascii_lowercase());
}
}
_ => {
// not a valid escape sequence
return None;
}
}
}
len
}
UnicodeEscape::CharacterName => {
let mut char_indices = input.char_indices();
if !matches!(char_indices.next(), Some((_, '{'))) {
return None;
}
loop {
if let Some((index, c)) = char_indices.next() {
match c {
'}' => {
if !normalised.is_empty() {
normalised.push('}');
}
// Name must be at least two characters long.
if index < 3 {
return None;
}
break index + '}'.len_utf8();
}
'0'..='9' | 'A'..='Z' | ' ' | '-' => {
if !normalised.is_empty() {
normalised.push(c);
}
}
'a'..='z' => {
if normalised.is_empty() {
normalised.reserve(c.len_utf8() + '}'.len_utf8());
normalised.push_str(&input[..index]);
normalised.push(c.to_ascii_uppercase());
} else {
normalised.push(c.to_ascii_uppercase());
}
}
_ => {
// Seems like an invalid escape sequence, don't normalise it.
return None;
}
}
} else {
// Unterminated escape sequence, don't normalise it.
return None;
}
}
}
};
Some(if normalised.is_empty() {
Cow::Borrowed(&input[..len])
} else {
Cow::Owned(normalised)
})
}
}
#[cfg(test)]
mod tests {
use std::borrow::Cow;
use crate::string::{QuoteChar, StringPrefix, StringQuotes};
use super::{normalize_string, UnicodeEscape};
#[test]
fn normalize_32_escape() {
let escape_sequence = UnicodeEscape::new('U', true).unwrap();
assert_eq!(
Some(Cow::Owned("0001f60e".to_string())),
escape_sequence.normalize("0001F60E")
);
}
#[test]
fn normalize_hex_in_byte_string() {
let input = r"\x89\x50\x4E\x47\x0D\x0A\x1A\x0A";
let normalized = normalize_string(
input,
StringQuotes {
triple: false,
quote_char: QuoteChar::Double,
},
StringPrefix::BYTE,
true,
);
assert_eq!(r"\x89\x50\x4e\x47\x0d\x0a\x1a\x0a", &normalized);
}
}

View File

@@ -873,11 +873,11 @@ impl Ranged for LogicalLine {
}
}
struct VerbatimText {
pub(crate) struct VerbatimText {
verbatim_range: TextRange,
}
fn verbatim_text<T>(item: T) -> VerbatimText
pub(crate) fn verbatim_text<T>(item: T) -> VerbatimText
where
T: Ranged,
{

View File

@@ -902,7 +902,7 @@ log.info(f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share
)
dict_with_lambda_values = {
@@ -524,61 +383,54 @@
@@ -524,65 +383,58 @@
# Complex string concatenations with a method call in the middle.
code = (
@@ -941,7 +941,7 @@ log.info(f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share
log.info(
- "Skipping:"
- f" {desc['db_id']} {foo('bar',x=123)} {'foo' != 'bar'} {(x := 'abc=')} {pos_share=} {desc['status']} {desc['exposure_max']}"
+ f'Skipping: {desc["db_id"]} {foo("bar",x=123)} {"foo" != "bar"} {(x := "abc=")} {pos_share=} {desc["status"]} {desc["exposure_max"]}'
+ f'Skipping: {desc["db_id"]} {foo("bar", x=123)} {"foo" != "bar"} {(x := "abc=")} {pos_share=} {desc["status"]} {desc["exposure_max"]}'
)
log.info(
@@ -981,6 +981,18 @@ log.info(f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share
)
log.info(
- f"""Skipping: {"a" == 'b'} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
+ f"""Skipping: {"a" == "b"} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
)
log.info(
@@ -590,5 +442,5 @@
)
log.info(
- f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share=} {desc['status']} {desc['exposure_max']}"""
+ f"""Skipping: {"a" == "b"} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
)
```
## Ruff Output
@@ -1394,7 +1406,7 @@ log.info(
)
log.info(
f'Skipping: {desc["db_id"]} {foo("bar",x=123)} {"foo" != "bar"} {(x := "abc=")} {pos_share=} {desc["status"]} {desc["exposure_max"]}'
f'Skipping: {desc["db_id"]} {foo("bar", x=123)} {"foo" != "bar"} {(x := "abc=")} {pos_share=} {desc["status"]} {desc["exposure_max"]}'
)
log.info(
@@ -1422,7 +1434,7 @@ log.info(
)
log.info(
f"""Skipping: {"a" == 'b'} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
f"""Skipping: {"a" == "b"} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
)
log.info(
@@ -1430,7 +1442,7 @@ log.info(
)
log.info(
f"""Skipping: {'a' == 'b'} {desc['ms_name']} {money=} {dte=} {pos_share=} {desc['status']} {desc['exposure_max']}"""
f"""Skipping: {"a" == "b"} {desc["ms_name"]} {money=} {dte=} {pos_share=} {desc["status"]} {desc["exposure_max"]}"""
)
```

View File

@@ -832,7 +832,7 @@ s = f'Lorem Ipsum is simply dummy text of the printing and typesetting industry:
some_commented_string = ( # This comment stays at the top.
"This string is long but not so long that it needs hahahah toooooo be so greatttt"
@@ -279,36 +280,25 @@
@@ -279,37 +280,26 @@
)
lpar_and_rpar_have_comments = func_call( # LPAR Comment
@@ -852,31 +852,32 @@ s = f'Lorem Ipsum is simply dummy text of the printing and typesetting industry:
- f" {'' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
-)
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {'' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
+
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {'{{}}' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
-cmd_fstring = (
- "sudo -E deluge-console info --detailed --sort-reverse=time_added"
- f" {'{{}}' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
-)
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {'{{}}' if ID is None else ID} | perl -nE 'print if /^{field}:/'"
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {{'' if ID is None else ID}} | perl -nE 'print if /^{field}:/'"
-cmd_fstring = (
- "sudo -E deluge-console info --detailed --sort-reverse=time_added {'' if ID is"
- f" None else ID}} | perl -nE 'print if /^{field}:/'"
-)
+cmd_fstring = f"sudo -E deluge-console info --detailed --sort-reverse=time_added {{'' if ID is None else ID}} | perl -nE 'print if /^{field}:/'"
+fstring = f"This string really doesn't need to be an {{{{fstring}}}}, but this one most certainly, absolutely {does}."
+
fstring = (
- "This string really doesn't need to be an {{fstring}}, but this one most"
- f" certainly, absolutely {does}."
+ f"We have to remember to escape {braces}." " Like {these}." f" But not {this}."
)
-
-fstring = f"We have to remember to escape {braces}. Like {{these}}. But not {this}."
-fstring = f"We have to remember to escape {braces}. Like {{these}}. But not {this}."
-
class A:
class B:
@@ -364,10 +354,7 @@
def foo():
if not hasattr(module, name):
@@ -979,7 +980,13 @@ s = f'Lorem Ipsum is simply dummy text of the printing and typesetting industry:
)
# The parens should NOT be removed in this case.
@@ -518,88 +494,78 @@
@@ -513,93 +489,83 @@
temp_msg = (
- f"{f'{humanize_number(pos)}.': <{pound_len+2}} "
+ f"{f'{humanize_number(pos)}.': <{pound_len + 2}} "
f"{balance: <{bal_len + 5}} "
f"<<{author.display_name}>>\n"
)
@@ -1103,7 +1110,13 @@ s = f'Lorem Ipsum is simply dummy text of the printing and typesetting industry:
"6. Click on Create Credential at the top."
'7. At the top click the link for "API key".'
"8. No application restrictions are needed. Click Create at the bottom."
@@ -613,55 +579,40 @@
@@ -608,60 +574,45 @@
# It shouldn't matter if the string prefixes are capitalized.
temp_msg = (
- f"{F'{humanize_number(pos)}.': <{pound_len+2}} "
+ f"{f'{humanize_number(pos)}.': <{pound_len + 2}} "
f"{balance: <{bal_len + 5}} "
f"<<{author.display_name}>>\n"
)
@@ -1688,7 +1701,7 @@ class X:
temp_msg = (
f"{f'{humanize_number(pos)}.': <{pound_len+2}} "
f"{f'{humanize_number(pos)}.': <{pound_len + 2}} "
f"{balance: <{bal_len + 5}} "
f"<<{author.display_name}>>\n"
)
@@ -1773,7 +1786,7 @@ message = (
# It shouldn't matter if the string prefixes are capitalized.
temp_msg = (
f"{F'{humanize_number(pos)}.': <{pound_len+2}} "
f"{f'{humanize_number(pos)}.': <{pound_len + 2}} "
f"{balance: <{bal_len + 5}} "
f"<<{author.display_name}>>\n"
)

Some files were not shown because too many files have changed in this diff Show More