Compare commits

..

52 Commits

Author SHA1 Message Date
Micha Reiser
b93d3e6f21 Add formatter to runes 2024-05-01 10:42:53 +02:00
Micha Reiser
523235d6ea First format draft 2024-05-01 10:42:51 +02:00
Charlie Marsh
414990c022 Respect async expressions in comprehension bodies (#11219)
## Summary

We weren't recursing into the comprehension body.
2024-04-30 18:38:31 +00:00
Jane Lewis
4779dd1173 Write ruff server setup guide for Helix (#11183)
## Summary

Closes #11027.
2024-04-30 10:15:29 -07:00
Charlie Marsh
c5adbf17da Ignore non-abstract class attributes when enforcing B024 (#11210)
## Summary

I think the check included here does make sense, but I don't see why we
would allow it if a value is provided for the attribute -- since, in
that case, isn't it _not_ abstract?

Closes: https://github.com/astral-sh/ruff/issues/11208.
2024-04-30 09:01:08 -07:00
Micha Reiser
c6dcf3502b [red-knot] Use FileId in module resolver to map from file to module (#11212) 2024-04-30 14:09:47 +00:00
Micha Reiser
1e585b8667 [red knot] Introduce LintDb (#11204) 2024-04-30 16:01:46 +02:00
Alex Waygood
21d824abfd [pylint] Also emit PLR0206 for properties with variadic parameters (#11200) 2024-04-30 11:59:37 +01:00
Micha Reiser
7e28c80354 [red-knot] Refactor program.check scheduling (#11202) 2024-04-30 07:23:41 +00:00
Micha Reiser
bc03d376e8 [red-knot] Add "cheap" program.snapshot (#11172) 2024-04-30 07:13:26 +00:00
Alex Waygood
eb6f562419 red-knot: introduce a StatisticsRecorder trait for the KeyValueCache (#11179)
## Summary

This PR changes the `DebugStatistics` and `ReleaseStatistics` structs so
that they implement a common `StatisticsRecorder` trait, and makes the
`KeyValueCache` struct generic over a type parameter bound to that
trait. The advantage of this approach is that it's much harder for the
`DebugStatistics` and `ReleaseStatistics` structs to accidentally grow
out of sync in the methods that they implement, which was the cause of
the release-build failure recently fixed in #11177.

## Test Plan

`cargo test -p red_knot` and `cargo build --release` both continue to
pass for me locally
2024-04-30 07:14:06 +01:00
Micha Reiser
5561d445d7 linter: Enable test-rules for test build (#11201) 2024-04-30 08:06:47 +02:00
plredmond
c391c8b6cb Red Knot - Add symbol flags (#11134)
* Adds `Symbol.flag` bitfield. Populates it from (the three renamed)
`add_or_update_symbol*` methods.
* Currently there are these flags supported:
  * `IS_DEFINED` is set in a scope where a variable is defined.
* `IS_USED` is set in a scope where a variable is referenced. (To have
both this and `IS_DEFINED` would require two separate appearances of a
variable in the same scope-- one def and one use.)
* `MARKED_GLOBAL` and `MARKED_NONLOCAL` are **not yet implemented**.
(*TODO: While traversing, if you find these declarations, add these
flags to the variable.*)
* Adds `Symbol.kind` field (commented) and the data structure which will
populate it: `Kind` which is an enum of freevar, cellvar,
implicit_global, and implicit_local. **Not yet populated**. (*TODO: a
second pass over the scope (or the ast?) will observe the
`MARKED_GLOBAL` and `MARKED_NONLOCAL` flags to populate this field. When
that's added, we'll uncomment the field.*)
* Adds a few tests that the `IS_DEFINED` and `IS_USED` fields are
correctly set and/or merged:
* Unit test that subsequent calls to `add_or_update_symbol` will merge
the flag arguments.
* Unit test that in the statement `x = foo`, the variable `foo` is
considered used but not defined.
* Unit test that in the statement `from bar import foo`, the variable
`foo` is considered defined but not used.

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2024-04-29 17:07:23 -07:00
Carl Meyer
ce030a467f [red-knot] resolve base class types (#11178)
## Summary

Resolve base class types, as long as they are simple names.

## Test Plan

cargo test
2024-04-29 16:22:30 -06:00
Dhruv Manilawala
04a922866a Add basic docs for the parser crate (#11199)
## Summary

This PR adds a basic README for the `ruff_python_parser` crate and
updates the CONTRIBUTING docs with the fuzzer and benchmark section.

Additionally, it also updates some inline documentation within the
parser crate and splits the `parse_program` function into
`parse_single_expression` and `parse_module` which will be called by
matching against the `Mode`.

This PR doesn't go into too much internal detail around the parser logic
due to the following reasons:
1. Where should the docs go? Should it be as a module docs in `lib.rs`
or in README?
2. The parser is still evolving and could include a lot of refactors
with the future work (feedback loop and improved error recovery and
resilience)

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-04-29 17:08:07 +00:00
Alex Waygood
0ed7af35ec Add a daily workflow to fuzz the parser with randomly selected seeds (#11203) 2024-04-29 17:54:17 +01:00
Alex Waygood
87929ad5f1 Add convenience methods for iterating over all parameter nodes in a function (#11174) 2024-04-29 10:36:15 +00:00
renovate[bot]
8a887daeb4 Update pre-commit dependencies (#11195)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2024-04-29 08:40:21 +00:00
renovate[bot]
7317d734be Update dependency monaco-editor to ^0.48.0 (#11197)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:34:49 +02:00
renovate[bot]
c1a2a60182 Update NPM Development dependencies (#11196)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:33:12 +02:00
renovate[bot]
8e056b3a93 Update Rust crate serde to v1.0.199 (#11192)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:17:15 +02:00
renovate[bot]
616dd1873f Update Rust crate matchit to v0.8.2 (#11189)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:16:32 +02:00
renovate[bot]
acfb1a83c9 Update Rust crate serde_with to v3.8.1 (#11193)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:16:14 +02:00
renovate[bot]
7c0e32f255 Update Rust crate schemars to v0.8.17 (#11191)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:15:38 +02:00
renovate[bot]
4b84c55e3a Update Rust crate parking_lot to v0.12.2 (#11190)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:14:57 +02:00
renovate[bot]
4c8d33ec45 Update Rust crate hashbrown to v0.14.5 (#11188)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:14:33 +02:00
Alex Waygood
113e259e6d Various small improvements to the fuzz-parser script (#11186) 2024-04-28 18:17:27 +00:00
Micha Reiser
3474e37836 [red-knot] Unresolved imports lint rule (#11164) 2024-04-28 12:12:49 +02:00
Charlie Marsh
dfe90a3b2b Add a --release build to CI (#11182)
## Summary

We merged a failure here (#11177), and it only takes ~five minutes
anyway (which is shorter than some of our other jobs).
2024-04-27 20:36:33 -04:00
Micha Reiser
00d7c01cfc [red-knot] Fix absolute imports in module.resolve_name (#11180) 2024-04-27 20:07:07 +02:00
Micha Reiser
983a06cec3 [red-knot] Resolve and check dependencies (#11161) 2024-04-27 15:49:03 +00:00
Alex Waygood
47692027bf Fix cargo build --release (#11177)
## Summary

`cargo build --release` currently fails to compile on `main`:

<details>

```
error[E0599]: no method named `hit` found for struct `ReleaseStatistics` in the current scope
   --> crates/red_knot/src/cache.rs:22:29
    |
22  |             self.statistics.hit();
    |                             ^^^ method not found in `ReleaseStatistics`
...
145 | pub struct ReleaseStatistics;
    | ---------------------------- method `hit` not found for this struct

error[E0599]: no method named `miss` found for struct `ReleaseStatistics` in the current scope
   --> crates/red_knot/src/cache.rs:25:29
    |
25  |             self.statistics.miss();
    |                             ^^^^ method not found in `ReleaseStatistics`
...
145 | pub struct ReleaseStatistics;
    | ---------------------------- method `miss` not found for this struct

error[E0599]: no method named `hit` found for struct `ReleaseStatistics` in the current scope
   --> crates/red_knot/src/cache.rs:36:33
    |
36  |                 self.statistics.hit();
    |                                 ^^^ method not found in `ReleaseStatistics`
...
145 | pub struct ReleaseStatistics;
    | ---------------------------- method `hit` not found for this struct

error[E0599]: no method named `miss` found for struct `ReleaseStatistics` in the current scope
   --> crates/red_knot/src/cache.rs:41:33
    |
41  |                 self.statistics.miss();
    |                                 ^^^^ method not found in `ReleaseStatistics`
...
145 | pub struct ReleaseStatistics;
    | ---------------------------- method `miss` not found for this struct
```

</details>

This is because in a release build, `CacheStatistics` is a type alias
for `ReleaseStatistics`, and `ReleaseStatistics` doesn't have `hit()` or
`miss()` methods. (In a debug build, `CacheStatistics` is a type alias
for `DebugStatistics`, which _does_ have those methods.)

Possibly we could make this less likely to happen in the future by
making both structs implement a common trait instead of using type
aliases that vary depending on whether it's a debug build or not? For
now, though, this PR just brings the two structs in sync w.r.t. the
methods they expose.

## Test Plan

`cargo build --release` now once again compiles for me locally
2024-04-27 11:45:32 -04:00
Charlie Marsh
ec3243a6e5 Prioritize redefined-while-unused over unused-import (#11173)
## Summary

This PR adds an override to the fixer to ensure that we apply any
`redefined-while-unused` fixes prior to `unused-import`.

Closes https://github.com/astral-sh/ruff/issues/10905.
2024-04-27 11:44:53 -04:00
Carl Meyer
2d6978f236 [red-knot] fix class vs instance (#11175)
## Summary

Clarify the type of an exact class object vs the type of instances of
that class.

## Test Plan

cargo test
2024-04-27 09:09:02 -06:00
Auguste Lalande
2490d2d4af [ruff] Detect duplicate codes as part of unused-noqa (RUF100) (#10850)
## Summary

Implement duplicate code detection as part of `RUF100`, mirroring the
behavior of `flake8-noqa` (`NQA005`) mentioned in #850. The idea to
merge the rule into `RUF100` was suggested by @MichaReiser
https://github.com/astral-sh/ruff/pull/10325#issuecomment-2025535444.

## Test Plan

Test cases were added to the fixture.
2024-04-27 12:33:44 +00:00
Jelle Zijlstra
59b73fabc1 [pyflakes] Improve invalid-print-syntax documentation (#11171)
This syntax wasn't "deprecated" in Python 3; it was removed.

I started looking at this rule because I was curious how Ruff could even
detect this without a Python 2 parser. Then I realized that
"print >> f, x" is actually valid Python 3 syntax: it creates a tuple
containing a right-shifted version of the print function.
2024-04-27 07:54:04 -04:00
Micha Reiser
61c97a037c red-knot: Introduce program.check (#11148) 2024-04-27 09:01:20 +00:00
Micha Reiser
7cd065e4a2 Kick off Red-knot (#10849)
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Carl Meyer <carl@astral.sh>
2024-04-27 08:34:00 +00:00
Carl Meyer
845ba7cf5f Make ImportFrom level just a u32 (#11170) 2024-04-26 20:38:35 -06:00
Auguste Lalande
5994414739 [ruff] Implement redirected-noqa (RUF101) (#11052)
## Summary

Based on discussion in #10850.

As it stands today `RUF100` will attempt to replace code redirects with
their target codes even though this is not the "goal" of `RUF100`. This
behavior is confusing and inconsistent, since code redirects which don't
otherwise violate `RUF100` will not be updated. The behavior is also
undocumented. Additionally, users who want to use `RUF100` but do not
want to update redirects have no way to opt out.

This PR explicitly detects redirects with a new rule `RUF101` and
patches `RUF100` to keep original codes in fixes and reporting.

## Test Plan

Added fixture.
2024-04-27 02:08:11 +00:00
Jane Lewis
632965d0fa ruff server: Support a custom TOML configuration file (#11140)
## Summary

Closes #10985.

The server now supports a custom TOML configuration file as a client
setting. The setting must be an absolute path to a file. If the file is
called `pyproject.toml`, the server will attempt to parse it as a
pyproject file - otherwise, it will attempt to parse it as a `ruff.toml`
file, even if the file has a name besides `ruff.toml`.

If an option is set in both the custom TOML configuration file and in
the client settings directly, the latter will be used.

## Test Plan

1. Create a `ruff.toml` file outside of the workspace you are testing.
Set an option that is different from the one in the configuration for
your test workspace.
2. Set the path to the configuration in NeoVim:
```lua
require('lspconfig').ruff.setup {
    init_options = {
      settings = {
        configuration = "absolute/path/to/your/configuration"
      }
    }
}
```
3. Confirm that the option in the configuration file is used, regardless
of what the option is set to in the workspace configuration.
4. Add the same option, with a different value, to the NeoVim
configuration directly. For example:
```lua
require('lspconfig').ruff.setup {
    init_options = {
      settings = {
        configuration = "absolute/path/to/your/configuration",
        lint = {
          select = []
        }
      }
    }
}
```
5. Confirm that the option set in client settings is used, regardless of
the value in either the custom configuration file or in the workspace
configuration.
2024-04-26 23:46:07 +00:00
Dhruv Manilawala
77a72ecd38 Avoid multiline expression if format specifier is present (#11123)
## Summary

This PR fixes the bug where the formatter would format an f-string and
could potentially change the AST.

For a triple-quoted f-string, the element can't be formatted into
multiline if it has a format specifier because otherwise the newline
would be treated as part of the format specifier.

Given the following f-string:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f} ddddddddddddddd eeeeeeee"""
```

The formatter sees that the f-string is already multiline so it assumes
that it can contain line breaks i.e., broken into multiple lines. But,
in this specific case we can't format it as:

```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f
} ddddddddddddddd eeeeeeee"""
```
                     
Because the format specifier string would become ".3f\n", which is not
the original string (`.3f`).

If the original source code already contained a newline, they'll be
preserved. For example:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f
} ddddddddddddddd eeeeeeee"""
```

The above will be formatted as:
```py
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {variable:.3f
} ddddddddddddddd eeeeeeee"""
```

Note that the newline after `.3f` is part of the format specifier which
needs to be preserved.
The Python version is irrelevant in this case.

fixes: #10040 

## Test Plan

Add some test cases to verify this behavior.
2024-04-26 13:34:38 +00:00
Jane Lewis
16a1f3cbcc ruff server: Support setting to prioritize project configuration over editor configuration (#11086)
## Summary

This is intended to address
https://github.com/astral-sh/ruff-vscode/issues/425, and is a follow-up
to https://github.com/astral-sh/ruff/pull/11062.

A new client setting is now supported by the server,
`prioritizeFileConfiguration`. This is a boolean setting (default:
`false`) that, if set to `true`, will instruct the configuration
resolver to prioritize file configuration (aka discovered TOML files)
over configuration passed in by the editor.

A corresponding extension PR has been opened, which makes this setting
available for VS Code:
https://github.com/astral-sh/ruff-vscode/pull/457.

## Test Plan

To test this with VS Code, you'll need to check out [the VS Code
PR](https://github.com/astral-sh/ruff-vscode/pull/457) that adds this
setting.

The test process is similar to
https://github.com/astral-sh/ruff/pull/11062, but in scenarios where the
editor configuration would take priority over file configuration, file
configuration should take priority.
2024-04-26 08:17:28 +00:00
Jelle Zijlstra
cd3e319538 Add support for PEP 696 syntax (#11120) 2024-04-26 09:47:29 +02:00
Shi Sheng
45725d3275 Update README.md (#11156)
## Summary

Sorry about an oversight, fixed the LICENSE redirection near the bottom
part of the README file. Now also is the full path
2024-04-25 22:46:47 -04:00
Shi Sheng
bbca8eb388 Update README.md (#11155)
## Summary

Modified the license badge so instead of `LICENSE`, it is now the full
path, and hoepfully will resolve the issue appeared in PyPI with the
license badge redirection
2024-04-25 22:14:21 -04:00
Steve C
c8c227dd5d [refurb] Implement fstring-number-format (FURB116) (#10921)
## Summary

Adds `FURB116`

See #1348 

## Test Plan

`cargo test`
2024-04-26 01:15:33 +00:00
Charlie Marsh
b15e9e6e05 Include inline instantiations when detecting loggers (#11154)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11031.
2024-04-25 21:00:12 -04:00
Ibraheem Ahmed
22d4f11348 Upgrade codspeed-criterion-compat to 2.6.0 (#11153)
## Summary

`codspeed-criterion-compat` 2.6.0 [updates it's package selection
mechanism](https://github.com/CodSpeedHQ/codspeed-rust/pull/43), so
https://github.com/astral-sh/ruff/pull/10735 is no longer needed.

## Test Plan

CI should still be passing.
2024-04-25 20:22:37 -04:00
Alex Waygood
269014a539 Delete unused methods from Parameters (#11150) 2024-04-25 22:11:24 +01:00
Charlie Marsh
dc09f529bc Build a separate ARM wheel for macOS (#11149)
## Summary

Since we already build an x86 wheel, we can just build an ARM wheel
rather than cross-compiling to universal.

The build time is ~3 minutes vs. > 20 minutes and the resulting artifact
is much smaller, which is also a win for users.
2024-04-25 15:06:47 -04:00
Auguste Lalande
3364ef957d [pygrep_hooks] Fix blanket-noqa panic when last line has noqa with no newline (PGH004) (#11108)
## Summary

Resolves #11102

The error stems from these lines

f5c7a62aa6/crates/ruff_linter/src/noqa.rs (L697-L702)
I don't really understand the purpose of incrementing the last index,
but it makes the resulting range invalid for indexing into `contents`.

For now I just detect if the index is too high in `blanket_noqa` and
adjust it if necessary.

## Test Plan

Created fixture from issue example.
2024-04-25 14:39:38 -04:00
217 changed files with 15137 additions and 2204 deletions

View File

@@ -194,6 +194,22 @@ jobs:
cd crates/ruff_wasm
wasm-pack test --node
cargo-build-release:
name: "cargo build (release)"
runs-on: macos-latest
needs: determine_changes
if: ${{ needs.determine_changes.outputs.code == 'true' || github.ref == 'refs/heads/main' }}
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
- name: "Install Rust toolchain"
run: rustup show
- name: "Install mold"
uses: rui314/setup-mold@v1
- uses: Swatinem/rust-cache@v2
- name: "Build"
run: cargo build --release --locked
cargo-fuzz:
name: "cargo fuzz"
runs-on: ubuntu-latest
@@ -568,23 +584,8 @@ jobs:
- uses: Swatinem/rust-cache@v2
# Codspeed comes with a very ancient cargo version (1.66) that resolves features flags differently than what we use now.
# This can result in build failures; see https://github.com/astral-sh/ruff/pull/10700.
# There's a pending codspeed PR to upgrade to a newer cargo version, but until that's merged, we need to use the workaround below.
# https://github.com/CodSpeedHQ/codspeed-rust/pull/31
# What we do is to call cargo build manually with the correct feature flags and RUSTC settings. We'll have to
# manually maintain the list of benchmarks to run with codspeed (the benefit is that we could detect which benchmarks to run and build based on the changes).
# This is inspired by https://github.com/oxc-project/oxc/blob/a0532adc654039a0c7ead7b35216dfa0b0cb8e8f/.github/workflows/benchmark.yml
- name: "Build benchmarks"
env:
RUSTFLAGS: "-C debuginfo=2 -C strip=none -g --cfg codspeed"
shell: bash
# Build all benchmarks, copy the binary to the codspeed directory, remove any `*.d` files that might have been created.
run: |
cargo build --release -p ruff_benchmark --bench parser --bench linter --bench formatter --bench lexer --features=codspeed
mkdir -p ./target/codspeed/ruff_benchmark
cp ./target/release/deps/{lexer,parser,linter,formatter}* target/codspeed/ruff_benchmark/
rm -rf ./target/codspeed/ruff_benchmark/*.d
run: cargo codspeed build --features codspeed -p ruff_benchmark
- name: "Run benchmarks"
uses: CodSpeedHQ/action@v2

72
.github/workflows/daily_fuzz.yaml vendored Normal file
View File

@@ -0,0 +1,72 @@
name: Daily parser fuzz
on:
workflow_dispatch:
schedule:
- cron: "0 0 * * *"
pull_request:
paths:
- ".github/workflows/daily_fuzz.yaml"
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
CARGO_INCREMENTAL: 0
CARGO_NET_RETRY: 10
CARGO_TERM_COLOR: always
RUSTUP_MAX_RETRIES: 10
PACKAGE_NAME: ruff
FORCE_COLOR: 1
jobs:
fuzz:
name: Fuzz
runs-on: ubuntu-latest
timeout-minutes: 20
# Don't run the cron job on forks:
if: ${{ github.repository == 'astral-sh/ruff' || github.event_name != 'schedule' }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: Install uv
run: curl -LsSf https://astral.sh/uv/install.sh | sh
- name: Install Python requirements
run: uv pip install -r scripts/fuzz-parser/requirements.txt --system
- name: "Install Rust toolchain"
run: rustup show
- name: "Install mold"
uses: rui314/setup-mold@v1
- uses: Swatinem/rust-cache@v2
- name: Build ruff
# A debug build means the script runs slower once it gets started,
# but this is outweighed by the fact that a release build takes *much* longer to compile in CI
run: cargo build --locked
- name: Fuzz
run: python scripts/fuzz-parser/fuzz.py $(shuf -i 0-9999999999999999999 -n 1000) --test-executable target/debug/ruff
create-issue-on-failure:
name: Create an issue if the daily fuzz surfaced any bugs
runs-on: ubuntu-latest
needs: fuzz
if: ${{ github.repository == 'astral-sh/ruff' && always() && github.event_name == 'schedule' && needs.fuzz.result == 'failure' }}
permissions:
issues: write
steps:
- uses: actions/github-script@v7
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
await github.rest.issues.create({
owner: "astral-sh",
repo: "ruff",
title: `Daily parser fuzz failed on ${new Date().toDateString()}`,
body: "Runs listed here: https://github.com/astral-sh/ruff/actions/workflows/daily_fuzz.yml",
labels: ["bug", "parser", "fuzzer"],
})

View File

@@ -97,8 +97,8 @@ jobs:
*.tar.gz
*.sha256
macos-universal:
runs-on: macos-12
macos-aarch64:
runs-on: macos-14
steps:
- uses: actions/checkout@v4
with:
@@ -106,16 +106,17 @@ jobs:
- uses: actions/setup-python@v5
with:
python-version: ${{ env.PYTHON_VERSION }}
architecture: x64
architecture: arm64
- name: "Prep README.md"
run: python scripts/transform_readme.py --target pypi
- name: "Build wheels - universal2"
- name: "Build wheels - aarch64"
uses: PyO3/maturin-action@v1
with:
args: --release --locked --target universal2-apple-darwin --out dist
- name: "Test wheel - universal2"
target: aarch64
args: --release --locked --out dist
- name: "Test wheel - aarch64"
run: |
pip install dist/${{ env.PACKAGE_NAME }}-*universal2.whl --force-reinstall
pip install dist/${{ env.PACKAGE_NAME }}-*.whl --force-reinstall
ruff --help
python -m ruff --help
- name: "Upload wheels"
@@ -451,7 +452,7 @@ jobs:
name: Upload to PyPI
runs-on: ubuntu-latest
needs:
- macos-universal
- macos-aarch64
- macos-x86_64
- windows
- linux

View File

@@ -41,7 +41,7 @@ repos:
)$
- repo: https://github.com/crate-ci/typos
rev: v1.20.9
rev: v1.20.10
hooks:
- id: typos
@@ -55,7 +55,7 @@ repos:
pass_filenames: false # This makes it a lot faster
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.4.1
rev: v0.4.2
hooks:
- id: ruff-format
- id: ruff

462
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -24,13 +24,14 @@ chrono = { version = "0.4.35", default-features = false, features = ["clock"] }
clap = { version = "4.5.3", features = ["derive"] }
clap_complete_command = { version = "0.5.1" }
clearscreen = { version = "3.0.0" }
codspeed-criterion-compat = { version = "2.4.0", default-features = false }
codspeed-criterion-compat = { version = "2.6.0", default-features = false }
colored = { version = "2.1.0" }
console_error_panic_hook = { version = "0.1.7" }
console_log = { version = "1.0.0" }
countme = { version = "3.0.1" }
criterion = { version = "0.5.1", default-features = false }
crossbeam-channel = { version = "0.5.12" }
crossbeam = { version = "0.8.4" }
dashmap = { version = "5.5.3" }
dirs = { version = "5.0.0" }
drop_bomb = { version = "0.1.5" }
env_logger = { version = "0.11.0" }
@@ -39,10 +40,12 @@ filetime = { version = "0.2.23" }
fs-err = { version = "2.11.0" }
glob = { version = "0.3.1" }
globset = { version = "0.4.14" }
hashbrown = "0.14.3"
hexf-parse = { version = "0.2.1" }
ignore = { version = "0.4.22" }
imara-diff = { version = "0.1.5" }
imperative = { version = "1.0.4" }
indexmap = { version = "2.2.6" }
indicatif = { version = "0.17.8" }
indoc = { version = "2.0.4" }
insta = { version = "1.35.1", feature = ["filters", "glob"] }
@@ -68,6 +71,7 @@ once_cell = { version = "1.19.0" }
path-absolutize = { version = "3.1.1" }
path-slash = { version = "0.2.1" }
pathdiff = { version = "0.2.1" }
parking_lot = "0.12.1"
pep440_rs = { version = "0.6.0", features = ["serde"] }
pretty_assertions = "1.3.0"
proc-macro2 = { version = "1.0.79" }

View File

@@ -4,7 +4,7 @@
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![image](https://img.shields.io/pypi/v/ruff.svg)](https://pypi.python.org/pypi/ruff)
[![image](https://img.shields.io/pypi/l/ruff.svg)](LICENSE)
[![image](https://img.shields.io/pypi/l/ruff.svg)](https://github.com/astral-sh/ruff/blob/main/LICENSE)
[![image](https://img.shields.io/pypi/pyversions/ruff.svg)](https://pypi.python.org/pypi/ruff)
[![Actions status](https://github.com/astral-sh/ruff/workflows/CI/badge.svg)](https://github.com/astral-sh/ruff/actions)
[![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?logo=discord&logoColor=white)](https://discord.com/invite/astral-sh)
@@ -499,7 +499,7 @@ If you're using Ruff, consider adding the Ruff badge to your project's `README.m
## License
This repository is licensed under the [MIT License](LICENSE)
This repository is licensed under the [MIT License](https://github.com/astral-sh/ruff/blob/main/LICENSE)
<div align="center">
<a target="_blank" href="https://astral.sh" style="background:none">

View File

@@ -0,0 +1,47 @@
[package]
name = "red_knot"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
homepage.workspace = true
documentation.workspace = true
repository.workspace = true
authors.workspace = true
license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
ruff_formatter = { path = "../ruff_formatter" }
ruff_index = { path = "../ruff_index" }
ruff_notebook = { path = "../ruff_notebook" }
ruff_python_ast = { path = "../ruff_python_ast" }
ruff_python_formatter = { path = "../ruff_python_formatter" }
ruff_python_parser = { path = "../ruff_python_parser" }
ruff_python_trivia = { path = "../ruff_python_trivia" }
ruff_text_size = { path = "../ruff_text_size" }
anyhow = { workspace = true }
bitflags = { workspace = true }
ctrlc = "3.4.4"
crossbeam = { workspace = true }
dashmap = { workspace = true }
hashbrown = { workspace = true }
indexmap = { workspace = true }
log = { workspace = true }
notify = { workspace = true }
parking_lot = { workspace = true }
rayon = { workspace = true }
rustc-hash = { workspace = true }
smallvec = { workspace = true }
smol_str = "0.2.1"
tracing = { workspace = true }
tracing-subscriber = { workspace = true }
tracing-tree = { workspace = true }
[dev-dependencies]
textwrap = "0.16.1"
tempfile = { workspace = true }
[lints]
workspace = true

View File

@@ -0,0 +1,415 @@
use std::any::type_name;
use std::fmt::{Debug, Formatter};
use std::hash::{Hash, Hasher};
use std::marker::PhantomData;
use rustc_hash::FxHashMap;
use ruff_index::{Idx, IndexVec};
use ruff_python_ast::visitor::preorder;
use ruff_python_ast::visitor::preorder::{PreorderVisitor, TraversalSignal};
use ruff_python_ast::{
AnyNodeRef, AstNode, ExceptHandler, ExceptHandlerExceptHandler, Expr, MatchCase, ModModule,
NodeKind, Parameter, Stmt, StmtAnnAssign, StmtAssign, StmtAugAssign, StmtClassDef,
StmtFunctionDef, StmtGlobal, StmtImport, StmtImportFrom, StmtNonlocal, StmtTypeAlias,
TypeParam, TypeParamParamSpec, TypeParamTypeVar, TypeParamTypeVarTuple, WithItem,
};
use ruff_text_size::{Ranged, TextRange};
/// A type agnostic ID that uniquely identifies an AST node in a file.
#[ruff_index::newtype_index]
pub struct AstId;
/// A typed ID that uniquely identifies an AST node in a file.
///
/// This is different from [`AstId`] in that it is a combination of ID and the type of the node the ID identifies.
/// Typing the ID prevents mixing IDs of different node types and allows to restrict the API to only accept
/// nodes for which an ID has been created (not all AST nodes get an ID).
pub struct TypedAstId<N: HasAstId> {
erased: AstId,
_marker: PhantomData<fn() -> N>,
}
impl<N: HasAstId> TypedAstId<N> {
/// Upcasts this ID from a more specific node type to a more general node type.
pub fn upcast<M: HasAstId>(self) -> TypedAstId<M>
where
N: Into<M>,
{
TypedAstId {
erased: self.erased,
_marker: PhantomData,
}
}
}
impl<N: HasAstId> Copy for TypedAstId<N> {}
impl<N: HasAstId> Clone for TypedAstId<N> {
fn clone(&self) -> Self {
*self
}
}
impl<N: HasAstId> PartialEq for TypedAstId<N> {
fn eq(&self, other: &Self) -> bool {
self.erased == other.erased
}
}
impl<N: HasAstId> Eq for TypedAstId<N> {}
impl<N: HasAstId> Hash for TypedAstId<N> {
fn hash<H: Hasher>(&self, state: &mut H) {
self.erased.hash(state);
}
}
impl<N: HasAstId> Debug for TypedAstId<N> {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_tuple("TypedAstId")
.field(&self.erased)
.field(&type_name::<N>())
.finish()
}
}
pub struct AstIds {
ids: IndexVec<AstId, NodeKey>,
reverse: FxHashMap<NodeKey, AstId>,
}
impl AstIds {
// TODO rust analyzer doesn't allocate an ID for every node. It only allocates ids for
// nodes with a corresponding HIR element, that is nodes that are definitions.
pub fn from_module(module: &ModModule) -> Self {
let mut visitor = AstIdsVisitor::default();
// TODO: visit_module?
// Make sure we visit the root
visitor.create_id(module);
visitor.visit_body(&module.body);
while let Some(deferred) = visitor.deferred.pop() {
match deferred {
DeferredNode::FunctionDefinition(def) => {
def.visit_preorder(&mut visitor);
}
DeferredNode::ClassDefinition(def) => def.visit_preorder(&mut visitor),
}
}
AstIds {
ids: visitor.ids,
reverse: visitor.reverse,
}
}
/// Returns the ID to the root node.
pub fn root(&self) -> NodeKey {
self.ids[AstId::new(0)]
}
/// Returns the [`TypedAstId`] for a node.
pub fn ast_id<N: HasAstId>(&self, node: &N) -> TypedAstId<N> {
let key = node.syntax_node_key();
TypedAstId {
erased: self.reverse.get(&key).copied().unwrap(),
_marker: PhantomData,
}
}
/// Returns the [`TypedAstId`] for the node identified with the given [`TypedNodeKey`].
pub fn ast_id_for_key<N: HasAstId>(&self, node: &TypedNodeKey<N>) -> TypedAstId<N> {
let ast_id = self.ast_id_for_node_key(node.inner);
TypedAstId {
erased: ast_id,
_marker: PhantomData,
}
}
/// Returns the untyped [`AstId`] for the node identified by the given `node` key.
pub fn ast_id_for_node_key(&self, node: NodeKey) -> AstId {
self.reverse
.get(&node)
.copied()
.expect("Can't find node in AstIds map.")
}
/// Returns the [`TypedNodeKey`] for the node identified by the given [`TypedAstId`].
pub fn key<N: HasAstId>(&self, id: TypedAstId<N>) -> TypedNodeKey<N> {
let syntax_key = self.ids[id.erased];
TypedNodeKey::new(syntax_key).unwrap()
}
pub fn node_key<H: HasAstId>(&self, id: TypedAstId<H>) -> NodeKey {
self.ids[id.erased]
}
}
impl std::fmt::Debug for AstIds {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
let mut map = f.debug_map();
for (key, value) in self.ids.iter_enumerated() {
map.entry(&key, &value);
}
map.finish()
}
}
impl PartialEq for AstIds {
fn eq(&self, other: &Self) -> bool {
self.ids == other.ids
}
}
impl Eq for AstIds {}
#[derive(Default)]
struct AstIdsVisitor<'a> {
ids: IndexVec<AstId, NodeKey>,
reverse: FxHashMap<NodeKey, AstId>,
deferred: Vec<DeferredNode<'a>>,
}
impl<'a> AstIdsVisitor<'a> {
fn create_id<A: HasAstId>(&mut self, node: &A) {
let node_key = node.syntax_node_key();
let id = self.ids.push(node_key);
self.reverse.insert(node_key, id);
}
}
impl<'a> PreorderVisitor<'a> for AstIdsVisitor<'a> {
fn visit_stmt(&mut self, stmt: &'a Stmt) {
match stmt {
Stmt::FunctionDef(def) => {
self.create_id(def);
self.deferred.push(DeferredNode::FunctionDefinition(def));
return;
}
// TODO defer visiting the assignment body, type alias parameters etc?
Stmt::ClassDef(def) => {
self.create_id(def);
self.deferred.push(DeferredNode::ClassDefinition(def));
return;
}
Stmt::Expr(_) => {
// Skip
return;
}
Stmt::Return(_) => {}
Stmt::Delete(_) => {}
Stmt::Assign(assignment) => self.create_id(assignment),
Stmt::AugAssign(assignment) => {
self.create_id(assignment);
}
Stmt::AnnAssign(assignment) => self.create_id(assignment),
Stmt::TypeAlias(assignment) => self.create_id(assignment),
Stmt::For(_) => {}
Stmt::While(_) => {}
Stmt::If(_) => {}
Stmt::With(_) => {}
Stmt::Match(_) => {}
Stmt::Raise(_) => {}
Stmt::Try(_) => {}
Stmt::Assert(_) => {}
Stmt::Import(import) => self.create_id(import),
Stmt::ImportFrom(import_from) => self.create_id(import_from),
Stmt::Global(global) => self.create_id(global),
Stmt::Nonlocal(non_local) => self.create_id(non_local),
Stmt::Pass(_) => {}
Stmt::Break(_) => {}
Stmt::Continue(_) => {}
Stmt::IpyEscapeCommand(_) => {}
}
preorder::walk_stmt(self, stmt);
}
fn visit_expr(&mut self, _expr: &'a Expr) {}
fn visit_parameter(&mut self, parameter: &'a Parameter) {
self.create_id(parameter);
preorder::walk_parameter(self, parameter);
}
fn visit_except_handler(&mut self, except_handler: &'a ExceptHandler) {
match except_handler {
ExceptHandler::ExceptHandler(except_handler) => {
self.create_id(except_handler);
}
}
preorder::walk_except_handler(self, except_handler);
}
fn visit_with_item(&mut self, with_item: &'a WithItem) {
self.create_id(with_item);
preorder::walk_with_item(self, with_item);
}
fn visit_match_case(&mut self, match_case: &'a MatchCase) {
self.create_id(match_case);
preorder::walk_match_case(self, match_case);
}
fn visit_type_param(&mut self, type_param: &'a TypeParam) {
self.create_id(type_param);
}
}
enum DeferredNode<'a> {
FunctionDefinition(&'a StmtFunctionDef),
ClassDefinition(&'a StmtClassDef),
}
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct TypedNodeKey<N: AstNode> {
/// The type erased node key.
inner: NodeKey,
_marker: PhantomData<fn() -> N>,
}
impl<N: AstNode> TypedNodeKey<N> {
pub fn from_node(node: &N) -> Self {
let inner = NodeKey {
kind: node.as_any_node_ref().kind(),
range: node.range(),
};
Self {
inner,
_marker: PhantomData,
}
}
pub fn new(node_key: NodeKey) -> Option<Self> {
N::can_cast(node_key.kind).then_some(TypedNodeKey {
inner: node_key,
_marker: PhantomData,
})
}
pub fn resolve<'a>(&self, root: AnyNodeRef<'a>) -> Option<N::Ref<'a>> {
let node_ref = self.inner.resolve(root)?;
Some(N::cast_ref(node_ref).unwrap())
}
pub fn resolve_unwrap<'a>(&self, root: AnyNodeRef<'a>) -> N::Ref<'a> {
self.resolve(root).expect("node should resolve")
}
pub fn erased(&self) -> &NodeKey {
&self.inner
}
}
struct FindNodeKeyVisitor<'a> {
key: NodeKey,
result: Option<AnyNodeRef<'a>>,
}
impl<'a> PreorderVisitor<'a> for FindNodeKeyVisitor<'a> {
fn enter_node(&mut self, node: AnyNodeRef<'a>) -> TraversalSignal {
if self.result.is_some() {
return TraversalSignal::Skip;
}
if node.range() == self.key.range && node.kind() == self.key.kind {
self.result = Some(node);
TraversalSignal::Skip
} else if node.range().contains_range(self.key.range) {
TraversalSignal::Traverse
} else {
TraversalSignal::Skip
}
}
fn visit_body(&mut self, body: &'a [Stmt]) {
// TODO it would be more efficient to use binary search instead of linear
for stmt in body {
if stmt.range().start() > self.key.range.end() {
break;
}
self.visit_stmt(stmt);
}
}
}
// TODO an alternative to this is to have a `NodeId` on each node (in increasing order depending on the position).
// This would allow to reduce the size of this to a u32.
// What would be nice if we could use an `Arc::weak_ref` here but that only works if we use
// `Arc` internally
// TODO: Implement the logic to resolve a node, given a db (and the correct file).
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct NodeKey {
kind: NodeKind,
range: TextRange,
}
impl NodeKey {
pub fn resolve<'a>(&self, root: AnyNodeRef<'a>) -> Option<AnyNodeRef<'a>> {
// We need to do a binary search here. Only traverse into a node if the range is withint the node
let mut visitor = FindNodeKeyVisitor {
key: *self,
result: None,
};
if visitor.enter_node(root) == TraversalSignal::Traverse {
root.visit_preorder(&mut visitor);
}
visitor.result
}
}
/// Marker trait implemented by AST nodes for which we extract the `AstId`.
pub trait HasAstId: AstNode {
fn node_key(&self) -> TypedNodeKey<Self>
where
Self: Sized,
{
TypedNodeKey {
inner: self.syntax_node_key(),
_marker: PhantomData,
}
}
fn syntax_node_key(&self) -> NodeKey {
NodeKey {
kind: self.as_any_node_ref().kind(),
range: self.range(),
}
}
}
impl HasAstId for StmtFunctionDef {}
impl HasAstId for StmtClassDef {}
impl HasAstId for StmtAnnAssign {}
impl HasAstId for StmtAugAssign {}
impl HasAstId for StmtAssign {}
impl HasAstId for StmtTypeAlias {}
impl HasAstId for ModModule {}
impl HasAstId for StmtImport {}
impl HasAstId for StmtImportFrom {}
impl HasAstId for Parameter {}
impl HasAstId for TypeParam {}
impl HasAstId for Stmt {}
impl HasAstId for TypeParamTypeVar {}
impl HasAstId for TypeParamTypeVarTuple {}
impl HasAstId for TypeParamParamSpec {}
impl HasAstId for StmtGlobal {}
impl HasAstId for StmtNonlocal {}
impl HasAstId for ExceptHandlerExceptHandler {}
impl HasAstId for WithItem {}
impl HasAstId for MatchCase {}

View File

@@ -0,0 +1,165 @@
use std::fmt::Formatter;
use std::hash::Hash;
use std::sync::atomic::{AtomicUsize, Ordering};
use crate::db::QueryResult;
use dashmap::mapref::entry::Entry;
use crate::FxDashMap;
/// Simple key value cache that locks on a per-key level.
pub struct KeyValueCache<K, V> {
map: FxDashMap<K, V>,
statistics: CacheStatistics,
}
impl<K, V> KeyValueCache<K, V>
where
K: Eq + Hash + Clone,
V: Clone,
{
pub fn try_get(&self, key: &K) -> Option<V> {
if let Some(existing) = self.map.get(key) {
self.statistics.hit();
Some(existing.clone())
} else {
self.statistics.miss();
None
}
}
pub fn get<F>(&self, key: &K, compute: F) -> QueryResult<V>
where
F: FnOnce(&K) -> QueryResult<V>,
{
Ok(match self.map.entry(key.clone()) {
Entry::Occupied(cached) => {
self.statistics.hit();
cached.get().clone()
}
Entry::Vacant(vacant) => {
self.statistics.miss();
let value = compute(key)?;
vacant.insert(value.clone());
value
}
})
}
pub fn set(&mut self, key: K, value: V) {
self.map.insert(key, value);
}
pub fn remove(&mut self, key: &K) -> Option<V> {
self.map.remove(key).map(|(_, value)| value)
}
pub fn clear(&mut self) {
self.map.clear();
self.map.shrink_to_fit();
}
pub fn statistics(&self) -> Option<Statistics> {
self.statistics.to_statistics()
}
}
impl<K, V> Default for KeyValueCache<K, V>
where
K: Eq + Hash,
V: Clone,
{
fn default() -> Self {
Self {
map: FxDashMap::default(),
statistics: CacheStatistics::default(),
}
}
}
impl<K, V> std::fmt::Debug for KeyValueCache<K, V>
where
K: std::fmt::Debug + Eq + Hash,
V: std::fmt::Debug,
{
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
let mut debug = f.debug_map();
for entry in &self.map {
debug.entry(&entry.value(), &entry.key());
}
debug.finish()
}
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Statistics {
pub hits: usize,
pub misses: usize,
}
impl Statistics {
#[allow(clippy::cast_precision_loss)]
pub fn hit_rate(&self) -> Option<f64> {
if self.hits + self.misses == 0 {
return None;
}
Some((self.hits as f64) / (self.hits + self.misses) as f64)
}
}
#[cfg(debug_assertions)]
pub type CacheStatistics = DebugStatistics;
#[cfg(not(debug_assertions))]
pub type CacheStatistics = ReleaseStatistics;
pub trait StatisticsRecorder {
fn hit(&self);
fn miss(&self);
fn to_statistics(&self) -> Option<Statistics>;
}
#[derive(Debug, Default)]
pub struct DebugStatistics {
hits: AtomicUsize,
misses: AtomicUsize,
}
impl StatisticsRecorder for DebugStatistics {
// TODO figure out appropriate Ordering
fn hit(&self) {
self.hits.fetch_add(1, Ordering::SeqCst);
}
fn miss(&self) {
self.misses.fetch_add(1, Ordering::SeqCst);
}
fn to_statistics(&self) -> Option<Statistics> {
let hits = self.hits.load(Ordering::SeqCst);
let misses = self.misses.load(Ordering::SeqCst);
Some(Statistics { hits, misses })
}
}
#[derive(Debug, Default)]
pub struct ReleaseStatistics;
impl StatisticsRecorder for ReleaseStatistics {
#[inline]
fn hit(&self) {}
#[inline]
fn miss(&self) {}
#[inline]
fn to_statistics(&self) -> Option<Statistics> {
None
}
}

View File

@@ -0,0 +1,42 @@
use std::sync::atomic::AtomicBool;
use std::sync::Arc;
#[derive(Debug, Clone, Default)]
pub struct CancellationTokenSource {
signal: Arc<AtomicBool>,
}
impl CancellationTokenSource {
pub fn new() -> Self {
Self {
signal: Arc::new(AtomicBool::new(false)),
}
}
#[tracing::instrument(level = "trace", skip_all)]
pub fn cancel(&self) {
self.signal.store(true, std::sync::atomic::Ordering::SeqCst);
}
pub fn is_cancelled(&self) -> bool {
self.signal.load(std::sync::atomic::Ordering::SeqCst)
}
pub fn token(&self) -> CancellationToken {
CancellationToken {
signal: self.signal.clone(),
}
}
}
#[derive(Clone, Debug)]
pub struct CancellationToken {
signal: Arc<AtomicBool>,
}
impl CancellationToken {
/// Returns `true` if cancellation has been requested.
pub fn is_cancelled(&self) -> bool {
self.signal.load(std::sync::atomic::Ordering::SeqCst)
}
}

296
crates/red_knot/src/db.rs Normal file
View File

@@ -0,0 +1,296 @@
use std::path::Path;
use std::sync::Arc;
pub use jars::{HasJar, HasJars};
pub use query::{QueryError, QueryResult};
pub use runtime::DbRuntime;
pub use storage::JarsStorage;
use crate::files::FileId;
use crate::lint::{Diagnostics, LintSemanticStorage, LintSyntaxStorage};
use crate::module::{Module, ModuleData, ModuleName, ModuleResolver, ModuleSearchPath};
use crate::parse::{Parsed, ParsedStorage};
use crate::source::{Source, SourceStorage};
use crate::symbols::{SymbolId, SymbolTable, SymbolTablesStorage};
use crate::types::{Type, TypeStore};
mod jars;
mod query;
mod runtime;
mod storage;
pub trait Database {
/// Returns a reference to the runtime of the current worker.
fn runtime(&self) -> &DbRuntime;
/// Returns a mutable reference to the runtime. Only one worker can hold a mutable reference to the runtime.
fn runtime_mut(&mut self) -> &mut DbRuntime;
/// Returns `Ok` if the queries have not been cancelled and `Err(QueryError::Cancelled)` otherwise.
fn cancelled(&self) -> QueryResult<()> {
self.runtime().cancelled()
}
/// Returns `true` if the queries have been cancelled.
fn is_cancelled(&self) -> bool {
self.runtime().is_cancelled()
}
}
/// Database that supports running queries from multiple threads.
pub trait ParallelDatabase: Database + Send {
/// Creates a snapshot of the database state that can be used to query the database in another thread.
///
/// The snapshot is a read-only view of the database but query results are shared between threads.
/// All queries will be automatically cancelled when applying any mutations (calling [`HasJars::jars_mut`])
/// to the database (not the snapshot, because they're readonly).
///
/// ## Creating a snapshot
///
/// Creating a snapshot of the database's jars is cheap but creating a snapshot of
/// other state stored on the database might require deep-cloning data. That's why you should
/// avoid creating snapshots in a hot function (e.g. don't create a snapshot for each file, instead
/// create a snapshot when scheduling the check of an entire program).
///
/// ## Salsa compatibility
/// Salsa prohibits creating a snapshot while running a local query (it's fine if other workers run a query) [[source](https://github.com/salsa-rs/salsa/issues/80)].
/// We should avoid creating snapshots while running a query because we might want to adopt Salsa in the future (if we can figure out persistent caching).
/// Unfortunately, the infrastructure doesn't provide an automated way of knowing when a query is run, that's
/// why we have to "enforce" this constraint manually.
fn snapshot(&self) -> Snapshot<Self>;
}
/// Readonly snapshot of a database.
///
/// ## Dead locks
/// A snapshot should always be dropped as soon as it is no longer necessary to run queries.
/// Storing the snapshot without running a query or periodically checking if cancellation was requested
/// can lead to deadlocks because mutating the [`Database`] requires cancels all pending queries
/// and waiting for all [`Snapshot`]s to be dropped.
#[derive(Debug)]
pub struct Snapshot<DB: ?Sized>
where
DB: ParallelDatabase,
{
db: DB,
}
impl<DB> Snapshot<DB>
where
DB: ParallelDatabase,
{
pub fn new(db: DB) -> Self {
Snapshot { db }
}
}
impl<DB> std::ops::Deref for Snapshot<DB>
where
DB: ParallelDatabase,
{
type Target = DB;
fn deref(&self) -> &DB {
&self.db
}
}
// Red knot specific databases code.
pub trait SourceDb: Database {
// queries
fn file_id(&self, path: &std::path::Path) -> FileId;
fn file_path(&self, file_id: FileId) -> Arc<std::path::Path>;
fn source(&self, file_id: FileId) -> QueryResult<Source>;
fn parse(&self, file_id: FileId) -> QueryResult<Parsed>;
}
pub trait SemanticDb: SourceDb {
// queries
fn resolve_module(&self, name: ModuleName) -> QueryResult<Option<Module>>;
fn file_to_module(&self, file_id: FileId) -> QueryResult<Option<Module>>;
fn path_to_module(&self, path: &Path) -> QueryResult<Option<Module>>;
fn symbol_table(&self, file_id: FileId) -> QueryResult<Arc<SymbolTable>>;
fn infer_symbol_type(&self, file_id: FileId, symbol_id: SymbolId) -> QueryResult<Type>;
// mutations
fn add_module(&mut self, path: &Path) -> Option<(Module, Vec<Arc<ModuleData>>)>;
fn set_module_search_paths(&mut self, paths: Vec<ModuleSearchPath>);
}
pub trait LintDb: SemanticDb {
fn lint_syntax(&self, file_id: FileId) -> QueryResult<Diagnostics>;
fn lint_semantic(&self, file_id: FileId) -> QueryResult<Diagnostics>;
}
pub trait Db: LintDb {}
#[derive(Debug, Default)]
pub struct SourceJar {
pub sources: SourceStorage,
pub parsed: ParsedStorage,
}
#[derive(Debug, Default)]
pub struct SemanticJar {
pub module_resolver: ModuleResolver,
pub symbol_tables: SymbolTablesStorage,
pub type_store: TypeStore,
}
#[derive(Debug, Default)]
pub struct LintJar {
pub lint_syntax: LintSyntaxStorage,
pub lint_semantic: LintSemanticStorage,
}
#[cfg(test)]
pub(crate) mod tests {
use std::path::Path;
use std::sync::Arc;
use crate::db::{
Database, DbRuntime, HasJar, HasJars, JarsStorage, LintDb, LintJar, QueryResult, SourceDb,
SourceJar,
};
use crate::files::{FileId, Files};
use crate::lint::{lint_semantic, lint_syntax, Diagnostics};
use crate::module::{
add_module, file_to_module, path_to_module, resolve_module, set_module_search_paths,
Module, ModuleData, ModuleName, ModuleSearchPath,
};
use crate::parse::{parse, Parsed};
use crate::source::{source_text, Source};
use crate::symbols::{symbol_table, SymbolId, SymbolTable};
use crate::types::{infer_symbol_type, Type};
use super::{SemanticDb, SemanticJar};
// This can be a partial database used in a single crate for testing.
// It would hold fewer data than the full database.
#[derive(Debug, Default)]
pub(crate) struct TestDb {
files: Files,
jars: JarsStorage<Self>,
}
impl HasJar<SourceJar> for TestDb {
fn jar(&self) -> QueryResult<&SourceJar> {
Ok(&self.jars()?.0)
}
fn jar_mut(&mut self) -> &mut SourceJar {
&mut self.jars_mut().0
}
}
impl HasJar<SemanticJar> for TestDb {
fn jar(&self) -> QueryResult<&SemanticJar> {
Ok(&self.jars()?.1)
}
fn jar_mut(&mut self) -> &mut SemanticJar {
&mut self.jars_mut().1
}
}
impl HasJar<LintJar> for TestDb {
fn jar(&self) -> QueryResult<&LintJar> {
Ok(&self.jars()?.2)
}
fn jar_mut(&mut self) -> &mut LintJar {
&mut self.jars_mut().2
}
}
impl SourceDb for TestDb {
fn file_id(&self, path: &Path) -> FileId {
self.files.intern(path)
}
fn file_path(&self, file_id: FileId) -> Arc<Path> {
self.files.path(file_id)
}
fn source(&self, file_id: FileId) -> QueryResult<Source> {
source_text(self, file_id)
}
fn parse(&self, file_id: FileId) -> QueryResult<Parsed> {
parse(self, file_id)
}
}
impl SemanticDb for TestDb {
fn resolve_module(&self, name: ModuleName) -> QueryResult<Option<Module>> {
resolve_module(self, name)
}
fn file_to_module(&self, file_id: FileId) -> QueryResult<Option<Module>> {
file_to_module(self, file_id)
}
fn path_to_module(&self, path: &Path) -> QueryResult<Option<Module>> {
path_to_module(self, path)
}
fn symbol_table(&self, file_id: FileId) -> QueryResult<Arc<SymbolTable>> {
symbol_table(self, file_id)
}
fn infer_symbol_type(&self, file_id: FileId, symbol_id: SymbolId) -> QueryResult<Type> {
infer_symbol_type(self, file_id, symbol_id)
}
fn add_module(&mut self, path: &Path) -> Option<(Module, Vec<Arc<ModuleData>>)> {
add_module(self, path)
}
fn set_module_search_paths(&mut self, paths: Vec<ModuleSearchPath>) {
set_module_search_paths(self, paths);
}
}
impl LintDb for TestDb {
fn lint_syntax(&self, file_id: FileId) -> QueryResult<Diagnostics> {
lint_syntax(self, file_id)
}
fn lint_semantic(&self, file_id: FileId) -> QueryResult<Diagnostics> {
lint_semantic(self, file_id)
}
}
impl HasJars for TestDb {
type Jars = (SourceJar, SemanticJar, LintJar);
fn jars(&self) -> QueryResult<&Self::Jars> {
self.jars.jars()
}
fn jars_mut(&mut self) -> &mut Self::Jars {
self.jars.jars_mut()
}
}
impl Database for TestDb {
fn runtime(&self) -> &DbRuntime {
self.jars.runtime()
}
fn runtime_mut(&mut self) -> &mut DbRuntime {
self.jars.runtime_mut()
}
}
}

View File

@@ -0,0 +1,37 @@
use crate::db::query::QueryResult;
/// Gives access to a specific jar in the database.
///
/// Nope, the terminology isn't borrowed from Java but from Salsa <https://salsa-rs.github.io/salsa/>,
/// which is an analogy to storing the salsa in different jars.
///
/// The basic idea is that each crate can define its own jar and the jars can be combined to a single
/// database in the top level crate. Each crate also defines its own `Database` trait. The combination of
/// `Database` trait and the jar allows to write queries in isolation without having to know how they get composed at the upper levels.
///
/// Salsa further defines a `HasIngredient` trait which slices the jar to a specific storage (e.g. a specific cache).
/// We don't need this just jet because we write our queries by hand. We may want a similar trait if we decide
/// to use a macro to generate the queries.
pub trait HasJar<T> {
/// Gives a read-only reference to the jar.
fn jar(&self) -> QueryResult<&T>;
/// Gives a mutable reference to the jar.
fn jar_mut(&mut self) -> &mut T;
}
/// Gives access to the jars in a database.
pub trait HasJars {
/// A type storing the jars.
///
/// Most commonly, this is a tuple where each jar is a tuple element.
type Jars: Default;
/// Gives access to the underlying jars but tests if the queries have been cancelled.
///
/// Returns `Err(QueryError::Cancelled)` if the queries have been cancelled.
fn jars(&self) -> QueryResult<&Self::Jars>;
/// Gives mutable access to the underlying jars.
fn jars_mut(&mut self) -> &mut Self::Jars;
}

View File

@@ -0,0 +1,20 @@
use std::fmt::{Display, Formatter};
/// Reason why a db query operation failed.
#[derive(Debug, Clone, Copy)]
pub enum QueryError {
/// The query was cancelled because the DB was mutated or the query was cancelled by the host (e.g. on a file change or when pressing CTRL+C).
Cancelled,
}
impl Display for QueryError {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
match self {
QueryError::Cancelled => f.write_str("query was cancelled"),
}
}
}
impl std::error::Error for QueryError {}
pub type QueryResult<T> = Result<T, QueryError>;

View File

@@ -0,0 +1,41 @@
use crate::cancellation::CancellationTokenSource;
use crate::db::{QueryError, QueryResult};
/// Holds the jar agnostic state of the database.
#[derive(Debug, Default)]
pub struct DbRuntime {
/// The cancellation token source used to signal other works that the queries should be aborted and
/// exit at the next possible point.
cancellation_token: CancellationTokenSource,
}
impl DbRuntime {
pub(super) fn snapshot(&self) -> Self {
Self {
cancellation_token: self.cancellation_token.clone(),
}
}
/// Cancels the pending queries of other workers. The current worker cannot have any pending
/// queries because we're holding a mutable reference to the runtime.
pub(super) fn cancel_other_workers(&mut self) {
self.cancellation_token.cancel();
// Set a new cancellation token so that we're in a non-cancelled state again when running the next
// query.
self.cancellation_token = CancellationTokenSource::default();
}
/// Returns `Ok` if the queries have not been cancelled and `Err(QueryError::Cancelled)` otherwise.
pub(super) fn cancelled(&self) -> QueryResult<()> {
if self.cancellation_token.is_cancelled() {
Err(QueryError::Cancelled)
} else {
Ok(())
}
}
/// Returns `true` if the queries have been cancelled.
pub(super) fn is_cancelled(&self) -> bool {
self.cancellation_token.is_cancelled()
}
}

View File

@@ -0,0 +1,117 @@
use std::fmt::Formatter;
use std::sync::Arc;
use crossbeam::sync::WaitGroup;
use crate::db::query::QueryResult;
use crate::db::runtime::DbRuntime;
use crate::db::{HasJars, ParallelDatabase};
/// Stores the jars of a database and the state for each worker.
///
/// Today, all state is shared across all workers, but it may be desired to store data per worker in the future.
pub struct JarsStorage<T>
where
T: HasJars + Sized,
{
// It's important that `jars_wait_group` is declared after `jars` to ensure that `jars` is dropped first.
// See https://doc.rust-lang.org/reference/destructors.html
/// Stores the jars of the database.
jars: Arc<T::Jars>,
/// Used to count the references to `jars`. Allows implementing `jars_mut` without requiring to clone `jars`.
jars_wait_group: WaitGroup,
/// The data agnostic state.
runtime: DbRuntime,
}
impl<Db> JarsStorage<Db>
where
Db: HasJars,
{
pub(super) fn new() -> Self {
Self {
jars: Arc::new(Db::Jars::default()),
jars_wait_group: WaitGroup::default(),
runtime: DbRuntime::default(),
}
}
/// Creates a snapshot of the jars.
///
/// Creating the snapshot is cheap because it doesn't clone the jars, it only increments a ref counter.
#[must_use]
pub fn snapshot(&self) -> JarsStorage<Db>
where
Db: ParallelDatabase,
{
Self {
jars: self.jars.clone(),
jars_wait_group: self.jars_wait_group.clone(),
runtime: self.runtime.snapshot(),
}
}
pub(crate) fn jars(&self) -> QueryResult<&Db::Jars> {
self.runtime.cancelled()?;
Ok(&self.jars)
}
/// Returns a mutable reference to the jars without cloning their content.
///
/// The method cancels any pending queries of other works and waits for them to complete so that
/// this instance is the only instance holding a reference to the jars.
pub(crate) fn jars_mut(&mut self) -> &mut Db::Jars {
// We have a mutable ref here, so no more workers can be spawned between calling this function and taking the mut ref below.
self.cancel_other_workers();
// Now all other references to `self.jars` should have been released. We can now safely return a mutable reference
// to the Arc's content.
let jars =
Arc::get_mut(&mut self.jars).expect("All references to jars should have been released");
jars
}
pub(crate) fn runtime(&self) -> &DbRuntime {
&self.runtime
}
pub(crate) fn runtime_mut(&mut self) -> &mut DbRuntime {
// Note: This method may need to use a similar trick to `jars_mut` if `DbRuntime` is ever to store data that is shared between workers.
&mut self.runtime
}
#[tracing::instrument(level = "trace", skip(self))]
fn cancel_other_workers(&mut self) {
self.runtime.cancel_other_workers();
// Wait for all other works to complete.
let existing_wait = std::mem::take(&mut self.jars_wait_group);
existing_wait.wait();
}
}
impl<Db> Default for JarsStorage<Db>
where
Db: HasJars,
{
fn default() -> Self {
Self::new()
}
}
impl<T> std::fmt::Debug for JarsStorage<T>
where
T: HasJars,
<T as HasJars>::Jars: std::fmt::Debug,
{
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_struct("SharedStorage")
.field("jars", &self.jars)
.field("jars_wait_group", &self.jars_wait_group)
.field("runtime", &self.runtime)
.finish()
}
}

View File

@@ -0,0 +1,148 @@
use std::fmt::{Debug, Formatter};
use std::hash::{Hash, Hasher};
use std::path::Path;
use std::sync::Arc;
use hashbrown::hash_map::RawEntryMut;
use parking_lot::RwLock;
use rustc_hash::FxHasher;
use ruff_index::{newtype_index, IndexVec};
type Map<K, V> = hashbrown::HashMap<K, V, ()>;
#[newtype_index]
pub struct FileId;
// TODO we'll need a higher level virtual file system abstraction that allows testing if a file exists
// or retrieving its content (ideally lazily and in a way that the memory can be retained later)
// I suspect that we'll end up with a FileSystem trait and our own Path abstraction.
#[derive(Clone, Default)]
pub struct Files {
inner: Arc<RwLock<FilesInner>>,
}
impl Files {
#[tracing::instrument(level = "debug", skip(self))]
pub fn intern(&self, path: &Path) -> FileId {
self.inner.write().intern(path)
}
pub fn try_get(&self, path: &Path) -> Option<FileId> {
self.inner.read().try_get(path)
}
#[tracing::instrument(level = "debug", skip(self))]
pub fn path(&self, id: FileId) -> Arc<Path> {
self.inner.read().path(id)
}
}
impl Debug for Files {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
let files = self.inner.read();
let mut debug = f.debug_map();
for item in files.iter() {
debug.entry(&item.0, &item.1);
}
debug.finish()
}
}
impl PartialEq for Files {
fn eq(&self, other: &Self) -> bool {
self.inner.read().eq(&other.inner.read())
}
}
impl Eq for Files {}
#[derive(Default)]
struct FilesInner {
by_path: Map<FileId, ()>,
// TODO should we use a map here to reclaim the space for removed files?
// TODO I think we should use our own path abstraction here to avoid having to normalize paths
// and dealing with non-utf paths everywhere.
by_id: IndexVec<FileId, Arc<Path>>,
}
impl FilesInner {
/// Inserts the path and returns a new id for it or returns the id if it is an existing path.
// TODO should this accept Path or PathBuf?
pub(crate) fn intern(&mut self, path: &Path) -> FileId {
let mut hasher = FxHasher::default();
path.hash(&mut hasher);
let hash = hasher.finish();
let entry = self
.by_path
.raw_entry_mut()
.from_hash(hash, |existing_file| &*self.by_id[*existing_file] == path);
match entry {
RawEntryMut::Occupied(entry) => *entry.key(),
RawEntryMut::Vacant(entry) => {
let id = self.by_id.push(Arc::from(path));
entry.insert_with_hasher(hash, id, (), |_| hash);
id
}
}
}
pub(crate) fn try_get(&self, path: &Path) -> Option<FileId> {
let mut hasher = FxHasher::default();
path.hash(&mut hasher);
let hash = hasher.finish();
Some(
*self
.by_path
.raw_entry()
.from_hash(hash, |existing_file| &*self.by_id[*existing_file] == path)?
.0,
)
}
/// Returns the path for the file with the given id.
pub(crate) fn path(&self, id: FileId) -> Arc<Path> {
self.by_id[id].clone()
}
pub(crate) fn iter(&self) -> impl Iterator<Item = (FileId, Arc<Path>)> + '_ {
self.by_path.keys().map(|id| (*id, self.by_id[*id].clone()))
}
}
impl PartialEq for FilesInner {
fn eq(&self, other: &Self) -> bool {
self.by_id == other.by_id
}
}
impl Eq for FilesInner {}
#[cfg(test)]
mod tests {
use super::*;
use std::path::PathBuf;
#[test]
fn insert_path_twice_same_id() {
let files = Files::default();
let path = PathBuf::from("foo/bar");
let id1 = files.intern(&path);
let id2 = files.intern(&path);
assert_eq!(id1, id2);
}
#[test]
fn insert_different_paths_different_ids() {
let files = Files::default();
let path1 = PathBuf::from("foo/bar");
let path2 = PathBuf::from("foo/bar/baz");
let id1 = files.intern(&path1);
let id2 = files.intern(&path2);
assert_ne!(id1, id2);
}
}

View File

@@ -0,0 +1,135 @@
use std::ops::{Deref, DerefMut};
use ruff_formatter::PrintedRange;
use ruff_python_formatter::{FormatModuleError, PyFormatOptions};
use ruff_text_size::TextRange;
use crate::cache::KeyValueCache;
use crate::db::{HasJar, QueryError, SourceDb};
use crate::files::FileId;
use crate::lint::Diagnostics;
use crate::FxDashSet;
pub(crate) trait FormatDb: SourceDb {
/// Formats a file and returns its formatted content or an indicator that it is unchanged.
fn format_file(&self, file_id: FileId) -> Result<FormattedFile, FormatError>;
/// Formats a range in a file.
fn format_file_range(
&self,
file_id: FileId,
range: TextRange,
) -> Result<PrintedRange, FormatError>;
fn check_file_formatted(&self, file_id: FileId) -> Result<Diagnostics, FormatError>;
}
#[tracing::instrument(level = "trace", skip(db))]
pub(crate) fn format_file<Db>(db: &Db, file_id: FileId) -> Result<FormattedFile, FormatError>
where
Db: FormatDb + HasJar<FormatJar>,
{
let formatted = &db.jar()?.formatted;
if formatted.contains(&file_id) {
return Ok(FormattedFile::Unchanged);
}
let source = db.source(file_id)?;
// TODO use the `format_module` method here to re-use the AST.
let printed =
ruff_python_formatter::format_module_source(source.text(), PyFormatOptions::default())?;
Ok(if printed.as_code() == source.text() {
formatted.insert(file_id);
FormattedFile::Unchanged
} else {
FormattedFile::Formatted(printed.into_code())
})
}
#[tracing::instrument(level = "trace", skip(db))]
pub(crate) fn format_file_range<Db: FormatDb + HasJar<FormatJar>>(
db: &Db,
file_id: FileId,
range: TextRange,
) -> Result<PrintedRange, FormatError> {
let formatted = &db.jar()?.formatted;
let source = db.source(file_id)?;
if formatted.contains(&file_id) {
return Ok(PrintedRange::new(source.text()[range].into(), range));
}
// TODO use the `format_module` method here to re-use the AST.
let result =
ruff_python_formatter::format_range(source.text(), range, PyFormatOptions::default())?;
Ok(result)
}
/// Checks if the file is correctly formatted. It creates a diagnostic for formatting issues.
#[tracing::instrument(level = "trace", skip(db))]
pub(crate) fn check_formatted<Db>(db: &Db, file_id: FileId) -> Result<Diagnostics, FormatError>
where
Db: FormatDb + HasJar<FormatJar>,
{
Ok(if db.format_file(file_id)?.is_unchanged() {
Diagnostics::Empty
} else {
Diagnostics::from(vec!["File is not formatted".to_string()])
})
}
#[derive(Debug)]
pub(crate) enum FormatError {
Format(FormatModuleError),
Query(QueryError),
}
impl From<FormatModuleError> for FormatError {
fn from(value: FormatModuleError) -> Self {
Self::Format(value)
}
}
impl From<QueryError> for FormatError {
fn from(value: QueryError) -> Self {
Self::Query(value)
}
}
#[derive(Clone, Eq, PartialEq, Debug)]
pub(crate) enum FormattedFile {
Formatted(String),
Unchanged,
}
impl FormattedFile {
pub(crate) const fn is_unchanged(&self) -> bool {
matches!(self, FormattedFile::Unchanged)
}
}
#[derive(Debug, Default)]
pub struct FormatJar {
pub formatted: FxDashSet<FileId>,
}
#[derive(Default, Debug)]
pub(crate) struct FormattedStorage(KeyValueCache<FileId, ()>);
impl Deref for FormattedStorage {
type Target = KeyValueCache<FileId, ()>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for FormattedStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}

View File

@@ -0,0 +1,67 @@
//! Key observations
//!
//! The HIR avoids allocations to large extends by:
//! * Using an arena per node type
//! * using ids and id ranges to reference items.
//!
//! Using separate arena per node type has the advantage that the IDs are relatively stable, because
//! they only change when a node of the same kind has been added or removed. (What's unclear is if that matters or if
//! it still triggers a re-compute because the AST-id in the node has changed).
//!
//! The HIR does not store all details. It mainly stores the *public* interface. There's a reference
//! back to the AST node to get more details.
//!
//!
use crate::ast_ids::{HasAstId, TypedAstId};
use crate::files::FileId;
use std::fmt::Formatter;
use std::hash::{Hash, Hasher};
pub struct HirAstId<N: HasAstId> {
file_id: FileId,
node_id: TypedAstId<N>,
}
impl<N: HasAstId> Copy for HirAstId<N> {}
impl<N: HasAstId> Clone for HirAstId<N> {
fn clone(&self) -> Self {
*self
}
}
impl<N: HasAstId> PartialEq for HirAstId<N> {
fn eq(&self, other: &Self) -> bool {
self.file_id == other.file_id && self.node_id == other.node_id
}
}
impl<N: HasAstId> Eq for HirAstId<N> {}
impl<N: HasAstId> std::fmt::Debug for HirAstId<N> {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_struct("HirAstId")
.field("file_id", &self.file_id)
.field("node_id", &self.node_id)
.finish()
}
}
impl<N: HasAstId> Hash for HirAstId<N> {
fn hash<H: Hasher>(&self, state: &mut H) {
self.file_id.hash(state);
self.node_id.hash(state);
}
}
impl<N: HasAstId> HirAstId<N> {
pub fn upcast<M: HasAstId>(self) -> HirAstId<M>
where
N: Into<M>,
{
HirAstId {
file_id: self.file_id,
node_id: self.node_id.upcast(),
}
}
}

View File

@@ -0,0 +1,556 @@
use std::ops::{Index, Range};
use ruff_index::{newtype_index, IndexVec};
use ruff_python_ast::visitor::preorder;
use ruff_python_ast::visitor::preorder::PreorderVisitor;
use ruff_python_ast::{
Decorator, ExceptHandler, ExceptHandlerExceptHandler, Expr, MatchCase, ModModule, Stmt,
StmtAnnAssign, StmtAssign, StmtClassDef, StmtFunctionDef, StmtGlobal, StmtImport,
StmtImportFrom, StmtNonlocal, StmtTypeAlias, TypeParam, TypeParamParamSpec, TypeParamTypeVar,
TypeParamTypeVarTuple, WithItem,
};
use crate::ast_ids::{AstIds, HasAstId};
use crate::files::FileId;
use crate::hir::HirAstId;
use crate::Name;
#[newtype_index]
pub struct FunctionId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Function {
ast_id: HirAstId<StmtFunctionDef>,
name: Name,
parameters: Range<ParameterId>,
type_parameters: Range<TypeParameterId>, // TODO: type_parameters, return expression, decorators
}
#[newtype_index]
pub struct ParameterId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Parameter {
kind: ParameterKind,
name: Name,
default: Option<()>, // TODO use expression HIR
ast_id: HirAstId<ruff_python_ast::Parameter>,
}
// TODO or should `Parameter` be an enum?
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub enum ParameterKind {
PositionalOnly,
Arguments,
Vararg,
KeywordOnly,
Kwarg,
}
#[newtype_index]
pub struct ClassId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Class {
name: Name,
ast_id: HirAstId<StmtClassDef>,
// TODO type parameters, inheritance, decorators, members
}
#[newtype_index]
pub struct AssignmentId;
// This can have more than one name...
// but that means we can't implement `name()` on `ModuleItem`.
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Assignment {
// TODO: Handle multiple names / targets
name: Name,
ast_id: HirAstId<StmtAssign>,
}
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct AnnotatedAssignment {
name: Name,
ast_id: HirAstId<StmtAnnAssign>,
}
#[newtype_index]
pub struct AnnotatedAssignmentId;
#[newtype_index]
pub struct TypeAliasId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct TypeAlias {
name: Name,
ast_id: HirAstId<StmtTypeAlias>,
parameters: Range<TypeParameterId>,
}
#[newtype_index]
pub struct TypeParameterId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub enum TypeParameter {
TypeVar(TypeParameterTypeVar),
ParamSpec(TypeParameterParamSpec),
TypeVarTuple(TypeParameterTypeVarTuple),
}
impl TypeParameter {
pub fn ast_id(&self) -> HirAstId<TypeParam> {
match self {
TypeParameter::TypeVar(type_var) => type_var.ast_id.upcast(),
TypeParameter::ParamSpec(param_spec) => param_spec.ast_id.upcast(),
TypeParameter::TypeVarTuple(type_var_tuple) => type_var_tuple.ast_id.upcast(),
}
}
}
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct TypeParameterTypeVar {
name: Name,
ast_id: HirAstId<TypeParamTypeVar>,
}
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct TypeParameterParamSpec {
name: Name,
ast_id: HirAstId<TypeParamParamSpec>,
}
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct TypeParameterTypeVarTuple {
name: Name,
ast_id: HirAstId<TypeParamTypeVarTuple>,
}
#[newtype_index]
pub struct GlobalId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Global {
// TODO track names
ast_id: HirAstId<StmtGlobal>,
}
#[newtype_index]
pub struct NonLocalId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct NonLocal {
// TODO track names
ast_id: HirAstId<StmtNonlocal>,
}
pub enum DefinitionId {
Function(FunctionId),
Parameter(ParameterId),
Class(ClassId),
Assignment(AssignmentId),
AnnotatedAssignment(AnnotatedAssignmentId),
Global(GlobalId),
NonLocal(NonLocalId),
TypeParameter(TypeParameterId),
TypeAlias(TypeAlias),
}
pub enum DefinitionItem {
Function(Function),
Parameter(Parameter),
Class(Class),
Assignment(Assignment),
AnnotatedAssignment(AnnotatedAssignment),
Global(Global),
NonLocal(NonLocal),
TypeParameter(TypeParameter),
TypeAlias(TypeAlias),
}
// The closest is rust-analyzers item-tree. It only represents "Items" which make the public interface of a module
// (it excludes any other statement or expressions). rust-analyzer uses it as the main input to the name resolution
// algorithm
// > It is the input to the name resolution algorithm, as well as to the queries defined in `adt.rs`,
// > `data.rs`, and most things in `attr.rs`.
//
// > One important purpose of this layer is to provide an "invalidation barrier" for incremental
// > computations: when typing inside an item body, the `ItemTree` of the modified file is typically
// > unaffected, so we don't have to recompute name resolution results or item data (see `data.rs`).
//
// I haven't fully figured this out but I think that this composes the "public" interface of a module?
// But maybe that's too optimistic.
//
//
#[derive(Debug, Clone, Default, Eq, PartialEq)]
pub struct Definitions {
functions: IndexVec<FunctionId, Function>,
parameters: IndexVec<ParameterId, Parameter>,
classes: IndexVec<ClassId, Class>,
assignments: IndexVec<AssignmentId, Assignment>,
annotated_assignments: IndexVec<AnnotatedAssignmentId, AnnotatedAssignment>,
type_aliases: IndexVec<TypeAliasId, TypeAlias>,
type_parameters: IndexVec<TypeParameterId, TypeParameter>,
globals: IndexVec<GlobalId, Global>,
non_locals: IndexVec<NonLocalId, NonLocal>,
}
impl Definitions {
pub fn from_module(module: &ModModule, ast_ids: &AstIds, file_id: FileId) -> Self {
let mut visitor = DefinitionsVisitor {
definitions: Definitions::default(),
ast_ids,
file_id,
};
visitor.visit_body(&module.body);
visitor.definitions
}
}
impl Index<FunctionId> for Definitions {
type Output = Function;
fn index(&self, index: FunctionId) -> &Self::Output {
&self.functions[index]
}
}
impl Index<ParameterId> for Definitions {
type Output = Parameter;
fn index(&self, index: ParameterId) -> &Self::Output {
&self.parameters[index]
}
}
impl Index<ClassId> for Definitions {
type Output = Class;
fn index(&self, index: ClassId) -> &Self::Output {
&self.classes[index]
}
}
impl Index<AssignmentId> for Definitions {
type Output = Assignment;
fn index(&self, index: AssignmentId) -> &Self::Output {
&self.assignments[index]
}
}
impl Index<AnnotatedAssignmentId> for Definitions {
type Output = AnnotatedAssignment;
fn index(&self, index: AnnotatedAssignmentId) -> &Self::Output {
&self.annotated_assignments[index]
}
}
impl Index<TypeAliasId> for Definitions {
type Output = TypeAlias;
fn index(&self, index: TypeAliasId) -> &Self::Output {
&self.type_aliases[index]
}
}
impl Index<GlobalId> for Definitions {
type Output = Global;
fn index(&self, index: GlobalId) -> &Self::Output {
&self.globals[index]
}
}
impl Index<NonLocalId> for Definitions {
type Output = NonLocal;
fn index(&self, index: NonLocalId) -> &Self::Output {
&self.non_locals[index]
}
}
impl Index<TypeParameterId> for Definitions {
type Output = TypeParameter;
fn index(&self, index: TypeParameterId) -> &Self::Output {
&self.type_parameters[index]
}
}
struct DefinitionsVisitor<'a> {
definitions: Definitions,
ast_ids: &'a AstIds,
file_id: FileId,
}
impl DefinitionsVisitor<'_> {
fn ast_id<N: HasAstId>(&self, node: &N) -> HirAstId<N> {
HirAstId {
file_id: self.file_id,
node_id: self.ast_ids.ast_id(node),
}
}
fn lower_function_def(&mut self, function: &StmtFunctionDef) -> FunctionId {
let name = Name::new(&function.name);
let first_type_parameter_id = self.definitions.type_parameters.next_index();
let mut last_type_parameter_id = first_type_parameter_id;
if let Some(type_params) = &function.type_params {
for parameter in &type_params.type_params {
let id = self.lower_type_parameter(parameter);
last_type_parameter_id = id;
}
}
let parameters = self.lower_parameters(&function.parameters);
self.definitions.functions.push(Function {
name,
ast_id: self.ast_id(function),
parameters,
type_parameters: first_type_parameter_id..last_type_parameter_id,
})
}
fn lower_parameters(&mut self, parameters: &ruff_python_ast::Parameters) -> Range<ParameterId> {
let first_parameter_id = self.definitions.parameters.next_index();
let mut last_parameter_id = first_parameter_id;
for parameter in &parameters.posonlyargs {
last_parameter_id = self.definitions.parameters.push(Parameter {
kind: ParameterKind::PositionalOnly,
name: Name::new(&parameter.parameter.name),
default: None,
ast_id: self.ast_id(&parameter.parameter),
});
}
if let Some(vararg) = &parameters.vararg {
last_parameter_id = self.definitions.parameters.push(Parameter {
kind: ParameterKind::Vararg,
name: Name::new(&vararg.name),
default: None,
ast_id: self.ast_id(vararg),
});
}
for parameter in &parameters.kwonlyargs {
last_parameter_id = self.definitions.parameters.push(Parameter {
kind: ParameterKind::KeywordOnly,
name: Name::new(&parameter.parameter.name),
default: None,
ast_id: self.ast_id(&parameter.parameter),
});
}
if let Some(kwarg) = &parameters.kwarg {
last_parameter_id = self.definitions.parameters.push(Parameter {
kind: ParameterKind::KeywordOnly,
name: Name::new(&kwarg.name),
default: None,
ast_id: self.ast_id(kwarg),
});
}
first_parameter_id..last_parameter_id
}
fn lower_class_def(&mut self, class: &StmtClassDef) -> ClassId {
let name = Name::new(&class.name);
self.definitions.classes.push(Class {
name,
ast_id: self.ast_id(class),
})
}
fn lower_assignment(&mut self, assignment: &StmtAssign) {
// FIXME handle multiple names
if let Some(Expr::Name(name)) = assignment.targets.first() {
self.definitions.assignments.push(Assignment {
name: Name::new(&name.id),
ast_id: self.ast_id(assignment),
});
}
}
fn lower_annotated_assignment(&mut self, annotated_assignment: &StmtAnnAssign) {
if let Expr::Name(name) = &*annotated_assignment.target {
self.definitions
.annotated_assignments
.push(AnnotatedAssignment {
name: Name::new(&name.id),
ast_id: self.ast_id(annotated_assignment),
});
}
}
fn lower_type_alias(&mut self, type_alias: &StmtTypeAlias) {
if let Expr::Name(name) = &*type_alias.name {
let name = Name::new(&name.id);
let lower_parameters_id = self.definitions.type_parameters.next_index();
let mut last_parameter_id = lower_parameters_id;
if let Some(type_params) = &type_alias.type_params {
for type_parameter in &type_params.type_params {
let id = self.lower_type_parameter(type_parameter);
last_parameter_id = id;
}
}
self.definitions.type_aliases.push(TypeAlias {
name,
ast_id: self.ast_id(type_alias),
parameters: lower_parameters_id..last_parameter_id,
});
}
}
fn lower_type_parameter(&mut self, type_parameter: &TypeParam) -> TypeParameterId {
match type_parameter {
TypeParam::TypeVar(type_var) => {
self.definitions
.type_parameters
.push(TypeParameter::TypeVar(TypeParameterTypeVar {
name: Name::new(&type_var.name),
ast_id: self.ast_id(type_var),
}))
}
TypeParam::ParamSpec(param_spec) => {
self.definitions
.type_parameters
.push(TypeParameter::ParamSpec(TypeParameterParamSpec {
name: Name::new(&param_spec.name),
ast_id: self.ast_id(param_spec),
}))
}
TypeParam::TypeVarTuple(type_var_tuple) => {
self.definitions
.type_parameters
.push(TypeParameter::TypeVarTuple(TypeParameterTypeVarTuple {
name: Name::new(&type_var_tuple.name),
ast_id: self.ast_id(type_var_tuple),
}))
}
}
}
fn lower_import(&mut self, _import: &StmtImport) {
// TODO
}
fn lower_import_from(&mut self, _import_from: &StmtImportFrom) {
// TODO
}
fn lower_global(&mut self, global: &StmtGlobal) -> GlobalId {
self.definitions.globals.push(Global {
ast_id: self.ast_id(global),
})
}
fn lower_non_local(&mut self, non_local: &StmtNonlocal) -> NonLocalId {
self.definitions.non_locals.push(NonLocal {
ast_id: self.ast_id(non_local),
})
}
fn lower_except_handler(&mut self, _except_handler: &ExceptHandlerExceptHandler) {
// TODO
}
fn lower_with_item(&mut self, _with_item: &WithItem) {
// TODO
}
fn lower_match_case(&mut self, _match_case: &MatchCase) {
// TODO
}
}
impl PreorderVisitor<'_> for DefinitionsVisitor<'_> {
fn visit_stmt(&mut self, stmt: &Stmt) {
match stmt {
// Definition statements
Stmt::FunctionDef(definition) => {
self.lower_function_def(definition);
self.visit_body(&definition.body);
}
Stmt::ClassDef(definition) => {
self.lower_class_def(definition);
self.visit_body(&definition.body);
}
Stmt::Assign(assignment) => {
self.lower_assignment(assignment);
}
Stmt::AnnAssign(annotated_assignment) => {
self.lower_annotated_assignment(annotated_assignment);
}
Stmt::TypeAlias(type_alias) => {
self.lower_type_alias(type_alias);
}
Stmt::Import(import) => self.lower_import(import),
Stmt::ImportFrom(import_from) => self.lower_import_from(import_from),
Stmt::Global(global) => {
self.lower_global(global);
}
Stmt::Nonlocal(non_local) => {
self.lower_non_local(non_local);
}
// Visit the compound statement bodies because they can contain other definitions.
Stmt::For(_)
| Stmt::While(_)
| Stmt::If(_)
| Stmt::With(_)
| Stmt::Match(_)
| Stmt::Try(_) => {
preorder::walk_stmt(self, stmt);
}
// Skip over simple statements because they can't contain any other definitions.
Stmt::Return(_)
| Stmt::Delete(_)
| Stmt::AugAssign(_)
| Stmt::Raise(_)
| Stmt::Assert(_)
| Stmt::Expr(_)
| Stmt::Pass(_)
| Stmt::Break(_)
| Stmt::Continue(_)
| Stmt::IpyEscapeCommand(_) => {
// No op
}
}
}
fn visit_expr(&mut self, _: &'_ Expr) {}
fn visit_decorator(&mut self, _decorator: &'_ Decorator) {}
fn visit_except_handler(&mut self, except_handler: &'_ ExceptHandler) {
match except_handler {
ExceptHandler::ExceptHandler(except_handler) => {
self.lower_except_handler(except_handler);
}
}
}
fn visit_with_item(&mut self, with_item: &'_ WithItem) {
self.lower_with_item(with_item);
}
fn visit_match_case(&mut self, match_case: &'_ MatchCase) {
self.lower_match_case(match_case);
self.visit_body(&match_case.body);
}
}

110
crates/red_knot/src/lib.rs Normal file
View File

@@ -0,0 +1,110 @@
use std::fmt::Formatter;
use std::hash::BuildHasherDefault;
use std::ops::Deref;
use std::path::{Path, PathBuf};
use rustc_hash::{FxHashSet, FxHasher};
use crate::files::FileId;
pub mod ast_ids;
pub mod cache;
pub mod cancellation;
pub mod db;
pub mod files;
mod format;
pub mod hir;
pub mod lint;
pub mod module;
mod parse;
pub mod program;
pub mod source;
mod symbols;
mod types;
pub mod watch;
pub(crate) type FxDashMap<K, V> = dashmap::DashMap<K, V, BuildHasherDefault<FxHasher>>;
#[allow(unused)]
pub(crate) type FxDashSet<V> = dashmap::DashSet<V, BuildHasherDefault<FxHasher>>;
pub(crate) type FxIndexSet<V> = indexmap::set::IndexSet<V, BuildHasherDefault<FxHasher>>;
#[derive(Debug, Clone)]
pub struct Workspace {
/// TODO this should be a resolved path. We should probably use a newtype wrapper that guarantees that
/// PATH is a UTF-8 path and is normalized.
root: PathBuf,
/// The files that are open in the workspace.
///
/// * Editor: The files that are actively being edited in the editor (the user has a tab open with the file).
/// * CLI: The resolved files passed as arguments to the CLI.
open_files: FxHashSet<FileId>,
}
impl Workspace {
pub fn new(root: PathBuf) -> Self {
Self {
root,
open_files: FxHashSet::default(),
}
}
pub fn root(&self) -> &Path {
self.root.as_path()
}
// TODO having the content in workspace feels wrong.
pub fn open_file(&mut self, file_id: FileId) {
self.open_files.insert(file_id);
}
pub fn close_file(&mut self, file_id: FileId) {
self.open_files.remove(&file_id);
}
// TODO introduce an `OpenFile` type instead of using an anonymous tuple.
pub fn open_files(&self) -> impl Iterator<Item = FileId> + '_ {
self.open_files.iter().copied()
}
pub fn is_file_open(&self, file_id: FileId) -> bool {
self.open_files.contains(&file_id)
}
}
#[derive(Debug, Clone, Eq, PartialEq, Hash)]
pub struct Name(smol_str::SmolStr);
impl Name {
#[inline]
pub fn new(name: &str) -> Self {
Self(smol_str::SmolStr::new(name))
}
pub fn as_str(&self) -> &str {
self.0.as_str()
}
}
impl Deref for Name {
type Target = str;
#[inline]
fn deref(&self) -> &Self::Target {
self.as_str()
}
}
impl<T> From<T> for Name
where
T: Into<smol_str::SmolStr>,
{
fn from(value: T) -> Self {
Self(value.into())
}
}
impl std::fmt::Display for Name {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.write_str(self.as_str())
}
}

264
crates/red_knot/src/lint.rs Normal file
View File

@@ -0,0 +1,264 @@
use std::cell::RefCell;
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use std::time::Duration;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{ModModule, StringLiteral};
use crate::cache::KeyValueCache;
use crate::db::{HasJar, LintDb, LintJar, QueryResult, SemanticDb};
use crate::files::FileId;
use crate::parse::Parsed;
use crate::source::Source;
use crate::symbols::{Definition, SymbolId, SymbolTable};
use crate::types::Type;
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn lint_syntax<Db>(db: &Db, file_id: FileId) -> QueryResult<Diagnostics>
where
Db: LintDb + HasJar<LintJar>,
{
let storage = &db.jar()?.lint_syntax;
#[allow(clippy::print_stdout)]
if std::env::var("RED_KNOT_SLOW_LINT").is_ok() {
for i in 0..10 {
db.cancelled()?;
println!("RED_KNOT_SLOW_LINT is set, sleeping for {i}/10 seconds");
std::thread::sleep(Duration::from_secs(1));
}
}
storage.get(&file_id, |file_id| {
let mut diagnostics = Vec::new();
let source = db.source(*file_id)?;
lint_lines(source.text(), &mut diagnostics);
let parsed = db.parse(*file_id)?;
if parsed.errors().is_empty() {
let ast = parsed.ast();
let mut visitor = SyntaxLintVisitor {
diagnostics,
source: source.text(),
};
visitor.visit_body(&ast.body);
diagnostics = visitor.diagnostics;
} else {
diagnostics.extend(parsed.errors().iter().map(std::string::ToString::to_string));
}
Ok(Diagnostics::from(diagnostics))
})
}
fn lint_lines(source: &str, diagnostics: &mut Vec<String>) {
for (line_number, line) in source.lines().enumerate() {
if line.len() < 88 {
continue;
}
let char_count = line.chars().count();
if char_count > 88 {
diagnostics.push(format!(
"Line {} is too long ({} characters)",
line_number + 1,
char_count
));
}
}
}
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn lint_semantic<Db>(db: &Db, file_id: FileId) -> QueryResult<Diagnostics>
where
Db: LintDb + HasJar<LintJar>,
{
let storage = &db.jar()?.lint_semantic;
storage.get(&file_id, |file_id| {
let source = db.source(*file_id)?;
let parsed = db.parse(*file_id)?;
let symbols = db.symbol_table(*file_id)?;
let context = SemanticLintContext {
file_id: *file_id,
source,
parsed,
symbols,
db,
diagnostics: RefCell::new(Vec::new()),
};
lint_unresolved_imports(&context)?;
Ok(Diagnostics::from(context.diagnostics.take()))
})
}
fn lint_unresolved_imports(context: &SemanticLintContext) -> QueryResult<()> {
// TODO: Consider iterating over the dependencies (imports) only instead of all definitions.
for (symbol, definition) in context.symbols().all_definitions() {
match definition {
Definition::Import(import) => {
let ty = context.infer_symbol_type(symbol)?;
if ty.is_unknown() {
context.push_diagnostic(format!("Unresolved module {}", import.module));
}
}
Definition::ImportFrom(import) => {
let ty = context.infer_symbol_type(symbol)?;
if ty.is_unknown() {
let module_name = import.module().map(Deref::deref).unwrap_or_default();
let message = if import.level() > 0 {
format!(
"Unresolved relative import '{}' from {}{}",
import.name(),
".".repeat(import.level() as usize),
module_name
)
} else {
format!(
"Unresolved import '{}' from '{}'",
import.name(),
module_name
)
};
context.push_diagnostic(message);
}
}
_ => {}
}
}
Ok(())
}
pub struct SemanticLintContext<'a> {
file_id: FileId,
source: Source,
parsed: Parsed,
symbols: Arc<SymbolTable>,
db: &'a dyn SemanticDb,
diagnostics: RefCell<Vec<String>>,
}
impl<'a> SemanticLintContext<'a> {
pub fn source_text(&self) -> &str {
self.source.text()
}
pub fn file_id(&self) -> FileId {
self.file_id
}
pub fn ast(&self) -> &ModModule {
self.parsed.ast()
}
pub fn symbols(&self) -> &SymbolTable {
&self.symbols
}
pub fn infer_symbol_type(&self, symbol_id: SymbolId) -> QueryResult<Type> {
self.db.infer_symbol_type(self.file_id, symbol_id)
}
pub fn push_diagnostic(&self, diagnostic: String) {
self.diagnostics.borrow_mut().push(diagnostic);
}
pub fn extend_diagnostics(&mut self, diagnostics: impl IntoIterator<Item = String>) {
self.diagnostics.get_mut().extend(diagnostics);
}
}
#[derive(Debug)]
struct SyntaxLintVisitor<'a> {
diagnostics: Vec<String>,
source: &'a str,
}
impl Visitor<'_> for SyntaxLintVisitor<'_> {
fn visit_string_literal(&mut self, string_literal: &'_ StringLiteral) {
// A very naive implementation of use double quotes
let text = &self.source[string_literal.range];
if text.starts_with('\'') {
self.diagnostics
.push("Use double quotes for strings".to_string());
}
}
}
#[derive(Debug, Clone)]
pub enum Diagnostics {
Empty,
List(Arc<Vec<String>>),
}
impl Diagnostics {
pub fn as_slice(&self) -> &[String] {
match self {
Diagnostics::Empty => &[],
Diagnostics::List(list) => list.as_slice(),
}
}
}
impl Deref for Diagnostics {
type Target = [String];
fn deref(&self) -> &Self::Target {
self.as_slice()
}
}
impl From<Vec<String>> for Diagnostics {
fn from(value: Vec<String>) -> Self {
if value.is_empty() {
Diagnostics::Empty
} else {
Diagnostics::List(Arc::new(value))
}
}
}
#[derive(Default, Debug)]
pub struct LintSyntaxStorage(KeyValueCache<FileId, Diagnostics>);
impl Deref for LintSyntaxStorage {
type Target = KeyValueCache<FileId, Diagnostics>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for LintSyntaxStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
#[derive(Default, Debug)]
pub struct LintSemanticStorage(KeyValueCache<FileId, Diagnostics>);
impl Deref for LintSemanticStorage {
type Target = KeyValueCache<FileId, Diagnostics>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for LintSemanticStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}

446
crates/red_knot/src/main.rs Normal file
View File

@@ -0,0 +1,446 @@
#![allow(clippy::dbg_macro)]
use std::collections::hash_map::Entry;
use std::path::Path;
use std::sync::Mutex;
use crossbeam::channel as crossbeam_channel;
use rustc_hash::FxHashMap;
use tracing::subscriber::Interest;
use tracing::{Level, Metadata};
use tracing_subscriber::filter::LevelFilter;
use tracing_subscriber::layer::{Context, Filter, SubscriberExt};
use tracing_subscriber::{Layer, Registry};
use tracing_tree::time::Uptime;
use red_knot::db::{
Database, HasJar, ParallelDatabase, QueryError, SemanticDb, SourceDb, SourceJar,
};
use red_knot::files::FileId;
use red_knot::module::{ModuleSearchPath, ModuleSearchPathKind};
use red_knot::program::check::ExecutionMode;
use red_knot::program::{FileChange, FileChangeKind, Program};
use red_knot::watch::FileWatcher;
use red_knot::Workspace;
#[allow(clippy::print_stdout, clippy::unnecessary_wraps, clippy::print_stderr)]
fn main() -> anyhow::Result<()> {
setup_tracing();
let arguments: Vec<_> = std::env::args().collect();
if arguments.len() < 2 {
eprintln!("Usage: red_knot <path>");
return Err(anyhow::anyhow!("Invalid arguments"));
}
let entry_point = Path::new(&arguments[1]);
if !entry_point.exists() {
eprintln!("The entry point does not exist.");
return Err(anyhow::anyhow!("Invalid arguments"));
}
if !entry_point.is_file() {
eprintln!("The entry point is not a file.");
return Err(anyhow::anyhow!("Invalid arguments"));
}
let workspace_folder = entry_point.parent().unwrap();
let workspace = Workspace::new(workspace_folder.to_path_buf());
let workspace_search_path = ModuleSearchPath::new(
workspace.root().to_path_buf(),
ModuleSearchPathKind::FirstParty,
);
let mut program = Program::new(workspace);
program.set_module_search_paths(vec![workspace_search_path]);
let entry_id = program.file_id(entry_point);
program.workspace_mut().open_file(entry_id);
let (main_loop, main_loop_cancellation_token) = MainLoop::new();
// Listen to Ctrl+C and abort the watch mode.
let main_loop_cancellation_token = Mutex::new(Some(main_loop_cancellation_token));
ctrlc::set_handler(move || {
let mut lock = main_loop_cancellation_token.lock().unwrap();
if let Some(token) = lock.take() {
token.stop();
}
})?;
let file_changes_notifier = main_loop.file_changes_notifier();
// Watch for file changes and re-trigger the analysis.
let mut file_watcher = FileWatcher::new(
move |changes| {
file_changes_notifier.notify(changes);
},
program.files().clone(),
)?;
file_watcher.watch_folder(workspace_folder)?;
main_loop.run(&mut program);
let source_jar: &SourceJar = program.jar().unwrap();
dbg!(source_jar.parsed.statistics());
dbg!(source_jar.sources.statistics());
Ok(())
}
struct MainLoop {
orchestrator_sender: crossbeam_channel::Sender<OrchestratorMessage>,
main_loop_receiver: crossbeam_channel::Receiver<MainLoopMessage>,
}
impl MainLoop {
fn new() -> (Self, MainLoopCancellationToken) {
let (orchestrator_sender, orchestrator_receiver) = crossbeam_channel::bounded(1);
let (main_loop_sender, main_loop_receiver) = crossbeam_channel::bounded(1);
let mut orchestrator = Orchestrator {
receiver: orchestrator_receiver,
sender: main_loop_sender.clone(),
revision: 0,
};
std::thread::spawn(move || {
orchestrator.run();
});
(
Self {
orchestrator_sender,
main_loop_receiver,
},
MainLoopCancellationToken {
sender: main_loop_sender,
},
)
}
fn file_changes_notifier(&self) -> FileChangesNotifier {
FileChangesNotifier {
sender: self.orchestrator_sender.clone(),
}
}
fn run(self, program: &mut Program) {
self.orchestrator_sender
.send(OrchestratorMessage::Run)
.unwrap();
for message in &self.main_loop_receiver {
tracing::trace!("Main Loop: Tick");
match message {
MainLoopMessage::CheckProgram { revision } => {
{
let program = program.snapshot();
let sender = self.orchestrator_sender.clone();
// Spawn a new task that checks the program. This needs to be done in a separate thread
// to prevent blocking the main loop here.
rayon::spawn(move || match program.check(ExecutionMode::ThreadPool) {
Ok(result) => {
sender
.send(OrchestratorMessage::CheckProgramCompleted {
diagnostics: result,
revision,
})
.unwrap();
}
Err(QueryError::Cancelled) => {}
});
}
if !program.is_cancelled() {
let _ = program.format();
}
}
MainLoopMessage::ApplyChanges(changes) => {
// Automatically cancels any pending queries and waits for them to complete.
program.apply_changes(changes.iter());
}
MainLoopMessage::CheckCompleted(diagnostics) => {
dbg!(diagnostics);
}
MainLoopMessage::Exit => {
return;
}
}
}
}
}
impl Drop for MainLoop {
fn drop(&mut self) {
self.orchestrator_sender
.send(OrchestratorMessage::Shutdown)
.unwrap();
}
}
#[derive(Debug, Clone)]
struct FileChangesNotifier {
sender: crossbeam_channel::Sender<OrchestratorMessage>,
}
impl FileChangesNotifier {
fn notify(&self, changes: Vec<FileChange>) {
self.sender
.send(OrchestratorMessage::FileChanges(changes))
.unwrap();
}
}
#[derive(Debug)]
struct MainLoopCancellationToken {
sender: crossbeam_channel::Sender<MainLoopMessage>,
}
impl MainLoopCancellationToken {
fn stop(self) {
self.sender.send(MainLoopMessage::Exit).unwrap();
}
}
struct Orchestrator {
/// Sends messages to the main loop.
sender: crossbeam_channel::Sender<MainLoopMessage>,
/// Receives messages from the main loop.
receiver: crossbeam_channel::Receiver<OrchestratorMessage>,
revision: usize,
}
impl Orchestrator {
fn run(&mut self) {
while let Ok(message) = self.receiver.recv() {
match message {
OrchestratorMessage::Run => {
self.sender
.send(MainLoopMessage::CheckProgram {
revision: self.revision,
})
.unwrap();
}
OrchestratorMessage::CheckProgramCompleted {
diagnostics,
revision,
} => {
// Only take the diagnostics if they are for the latest revision.
if self.revision == revision {
self.sender
.send(MainLoopMessage::CheckCompleted(diagnostics))
.unwrap();
} else {
tracing::debug!("Discarding diagnostics for outdated revision {revision} (current: {}).", self.revision);
}
}
OrchestratorMessage::FileChanges(changes) => {
// Request cancellation, but wait until all analysis tasks have completed to
// avoid stale messages in the next main loop.
self.revision += 1;
self.debounce_changes(changes);
}
OrchestratorMessage::Shutdown => {
return self.shutdown();
}
}
}
}
fn debounce_changes(&self, changes: Vec<FileChange>) {
let mut aggregated_changes = AggregatedChanges::default();
aggregated_changes.extend(changes);
loop {
// Consume possibly incoming file change messages before running a new analysis, but don't wait for more than 100ms.
crossbeam_channel::select! {
recv(self.receiver) -> message => {
match message {
Ok(OrchestratorMessage::Shutdown) => {
return self.shutdown();
}
Ok(OrchestratorMessage::FileChanges(file_changes)) => {
aggregated_changes.extend(file_changes);
}
Ok(OrchestratorMessage::CheckProgramCompleted { .. })=> {
// disregard any outdated completion message.
}
Ok(OrchestratorMessage::Run) => unreachable!("The orchestrator is already running."),
Err(_) => {
// There are no more senders, no point in waiting for more messages
return;
}
}
},
default(std::time::Duration::from_millis(10)) => {
// No more file changes after 10 ms, send the changes and schedule a new analysis
self.sender.send(MainLoopMessage::ApplyChanges(aggregated_changes)).unwrap();
self.sender.send(MainLoopMessage::CheckProgram { revision: self.revision}).unwrap();
return;
}
}
}
}
#[allow(clippy::unused_self)]
fn shutdown(&self) {
tracing::trace!("Shutting down orchestrator.");
}
}
/// Message sent from the orchestrator to the main loop.
#[derive(Debug)]
enum MainLoopMessage {
CheckProgram { revision: usize },
CheckCompleted(Vec<String>),
ApplyChanges(AggregatedChanges),
Exit,
}
#[derive(Debug)]
enum OrchestratorMessage {
Run,
Shutdown,
CheckProgramCompleted {
diagnostics: Vec<String>,
revision: usize,
},
FileChanges(Vec<FileChange>),
}
#[derive(Default, Debug)]
struct AggregatedChanges {
changes: FxHashMap<FileId, FileChangeKind>,
}
impl AggregatedChanges {
fn add(&mut self, change: FileChange) {
match self.changes.entry(change.file_id()) {
Entry::Occupied(mut entry) => {
let merged = entry.get_mut();
match (merged, change.kind()) {
(FileChangeKind::Created, FileChangeKind::Deleted) => {
// Deletion after creations means that ruff never saw the file.
entry.remove();
}
(FileChangeKind::Created, FileChangeKind::Modified) => {
// No-op, for ruff, modifying a file that it doesn't yet know that it exists is still considered a creation.
}
(FileChangeKind::Modified, FileChangeKind::Created) => {
// Uhh, that should probably not happen. Continue considering it a modification.
}
(FileChangeKind::Modified, FileChangeKind::Deleted) => {
*entry.get_mut() = FileChangeKind::Deleted;
}
(FileChangeKind::Deleted, FileChangeKind::Created) => {
*entry.get_mut() = FileChangeKind::Modified;
}
(FileChangeKind::Deleted, FileChangeKind::Modified) => {
// That's weird, but let's consider it a modification.
*entry.get_mut() = FileChangeKind::Modified;
}
(FileChangeKind::Created, FileChangeKind::Created)
| (FileChangeKind::Modified, FileChangeKind::Modified)
| (FileChangeKind::Deleted, FileChangeKind::Deleted) => {
// No-op transitions. Some of them should be impossible but we handle them anyway.
}
}
}
Entry::Vacant(entry) => {
entry.insert(change.kind());
}
}
}
fn extend<I>(&mut self, changes: I)
where
I: IntoIterator<Item = FileChange>,
I::IntoIter: ExactSizeIterator,
{
let iter = changes.into_iter();
self.changes.reserve(iter.len());
for change in iter {
self.add(change);
}
}
fn iter(&self) -> impl Iterator<Item = FileChange> + '_ {
self.changes
.iter()
.map(|(id, kind)| FileChange::new(*id, *kind))
}
}
fn setup_tracing() {
let subscriber = Registry::default().with(
tracing_tree::HierarchicalLayer::default()
.with_indent_lines(true)
.with_indent_amount(2)
.with_bracketed_fields(true)
.with_thread_ids(true)
.with_targets(true)
.with_writer(|| Box::new(std::io::stderr()))
.with_timer(Uptime::default())
.with_filter(LoggingFilter {
trace_level: Level::TRACE,
}),
);
tracing::subscriber::set_global_default(subscriber).unwrap();
}
struct LoggingFilter {
trace_level: Level,
}
impl LoggingFilter {
fn is_enabled(&self, meta: &Metadata<'_>) -> bool {
let filter = if meta.target().starts_with("red_knot") || meta.target().starts_with("ruff") {
self.trace_level
} else {
Level::INFO
};
meta.level() <= &filter
}
}
impl<S> Filter<S> for LoggingFilter {
fn enabled(&self, meta: &Metadata<'_>, _cx: &Context<'_, S>) -> bool {
self.is_enabled(meta)
}
fn callsite_enabled(&self, meta: &'static Metadata<'static>) -> Interest {
if self.is_enabled(meta) {
Interest::always()
} else {
Interest::never()
}
}
fn max_level_hint(&self) -> Option<LevelFilter> {
Some(LevelFilter::from_level(self.trace_level))
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,95 @@
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use ruff_python_ast as ast;
use ruff_python_parser::{Mode, ParseError};
use ruff_text_size::{Ranged, TextRange};
use crate::cache::KeyValueCache;
use crate::db::{HasJar, QueryResult, SourceDb, SourceJar};
use crate::files::FileId;
#[derive(Debug, Clone, PartialEq)]
pub struct Parsed {
inner: Arc<ParsedInner>,
}
#[derive(Debug, PartialEq)]
struct ParsedInner {
ast: ast::ModModule,
errors: Vec<ParseError>,
}
impl Parsed {
fn new(ast: ast::ModModule, errors: Vec<ParseError>) -> Self {
Self {
inner: Arc::new(ParsedInner { ast, errors }),
}
}
pub(crate) fn from_text(text: &str) -> Self {
let result = ruff_python_parser::parse(text, Mode::Module);
let (module, errors) = match result {
Ok(ast::Mod::Module(module)) => (module, vec![]),
Ok(ast::Mod::Expression(expression)) => (
ast::ModModule {
range: expression.range(),
body: vec![ast::Stmt::Expr(ast::StmtExpr {
range: expression.range(),
value: expression.body,
})],
},
vec![],
),
Err(errors) => (
ast::ModModule {
range: TextRange::default(),
body: Vec::new(),
},
vec![errors],
),
};
Parsed::new(module, errors)
}
pub fn ast(&self) -> &ast::ModModule {
&self.inner.ast
}
pub fn errors(&self) -> &[ParseError] {
&self.inner.errors
}
}
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn parse<Db>(db: &Db, file_id: FileId) -> QueryResult<Parsed>
where
Db: SourceDb + HasJar<SourceJar>,
{
let parsed = db.jar()?;
parsed.parsed.get(&file_id, |file_id| {
let source = db.source(*file_id)?;
Ok(Parsed::from_text(source.text()))
})
}
#[derive(Debug, Default)]
pub struct ParsedStorage(KeyValueCache<FileId, Parsed>);
impl Deref for ParsedStorage {
type Target = KeyValueCache<FileId, Parsed>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for ParsedStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}

View File

@@ -0,0 +1,425 @@
use rayon::{current_num_threads, yield_local};
use rustc_hash::FxHashSet;
use crate::db::{Database, LintDb, QueryError, QueryResult, SemanticDb};
use crate::files::FileId;
use crate::format::{FormatDb, FormatError};
use crate::lint::Diagnostics;
use crate::program::Program;
use crate::symbols::Dependency;
impl Program {
/// Checks all open files in the workspace and its dependencies.
#[tracing::instrument(level = "debug", skip_all)]
pub fn check(&self, mode: ExecutionMode) -> QueryResult<Vec<String>> {
self.cancelled()?;
let mut context = CheckContext::new(self);
match mode {
ExecutionMode::SingleThreaded => SingleThreadedExecutor.run(&mut context)?,
ExecutionMode::ThreadPool => ThreadPoolExecutor.run(&mut context)?,
};
Ok(context.finish())
}
#[tracing::instrument(level = "debug", skip(self, context))]
fn check_file(&self, file: FileId, context: &CheckFileContext) -> QueryResult<Diagnostics> {
self.cancelled()?;
let symbol_table = self.symbol_table(file)?;
let dependencies = symbol_table.dependencies();
if !dependencies.is_empty() {
let module = self.file_to_module(file)?;
// TODO scheduling all dependencies here is wasteful if we don't infer any types on them
// but I think that's unlikely, so it is okay?
// Anyway, we need to figure out a way to retrieve the dependencies of a module
// from the persistent cache. So maybe it should be a separate query after all.
for dependency in dependencies {
let dependency_name = match dependency {
Dependency::Module(name) => Some(name.clone()),
Dependency::Relative { .. } => match &module {
Some(module) => module.resolve_dependency(self, dependency)?,
None => None,
},
};
if let Some(dependency_name) = dependency_name {
// TODO We may want to have a different check functions for non-first-party
// files because we only need to index them and not check them.
// Supporting non-first-party code also requires supporting typing stubs.
if let Some(dependency) = self.resolve_module(dependency_name)? {
if dependency.path(self)?.root().kind().is_first_party() {
context.schedule_dependency(dependency.path(self)?.file());
}
}
}
}
}
let mut diagnostics = Vec::new();
if self.workspace().is_file_open(file) {
diagnostics.extend_from_slice(&self.lint_syntax(file)?);
diagnostics.extend_from_slice(&self.lint_semantic(file)?);
match self.check_file_formatted(file) {
Ok(format_diagnostics) => {
diagnostics.extend_from_slice(&format_diagnostics);
}
Err(FormatError::Query(err)) => {
return Err(err);
}
Err(FormatError::Format(error)) => {
diagnostics.push(format!("Error formatting file: {error}"));
}
}
}
Ok(Diagnostics::from(diagnostics))
}
}
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum ExecutionMode {
SingleThreaded,
ThreadPool,
}
/// Context that stores state information about the entire check operation.
struct CheckContext<'a> {
/// IDs of the files that have been queued for checking.
///
/// Used to avoid queuing the same file twice.
scheduled_files: FxHashSet<FileId>,
/// Reference to the program that is checked.
program: &'a Program,
/// The aggregated diagnostics
diagnostics: Vec<String>,
}
impl<'a> CheckContext<'a> {
fn new(program: &'a Program) -> Self {
Self {
scheduled_files: FxHashSet::default(),
program,
diagnostics: Vec::new(),
}
}
/// Returns the tasks to check all open files in the workspace.
fn check_open_files(&mut self) -> Vec<CheckOpenFileTask> {
self.scheduled_files
.extend(self.program.workspace().open_files());
self.program
.workspace()
.open_files()
.map(|file_id| CheckOpenFileTask { file_id })
.collect()
}
/// Returns the task to check a dependency.
fn check_dependency(&mut self, file_id: FileId) -> Option<CheckDependencyTask> {
if self.scheduled_files.insert(file_id) {
Some(CheckDependencyTask { file_id })
} else {
None
}
}
/// Pushes the result for a single file check operation
fn push_diagnostics(&mut self, diagnostics: &Diagnostics) {
self.diagnostics.extend_from_slice(diagnostics);
}
/// Returns a reference to the program that is being checked.
fn program(&self) -> &'a Program {
self.program
}
/// Creates a task context that is used to check a single file.
fn task_context<'b, S>(&self, dependency_scheduler: &'b S) -> CheckTaskContext<'a, 'b, S>
where
S: ScheduleDependency,
{
CheckTaskContext {
program: self.program,
dependency_scheduler,
}
}
fn finish(self) -> Vec<String> {
self.diagnostics
}
}
/// Trait that abstracts away how a dependency of a file gets scheduled for checking.
trait ScheduleDependency {
/// Schedules the file with the given ID for checking.
fn schedule(&self, file_id: FileId);
}
impl<T> ScheduleDependency for T
where
T: Fn(FileId),
{
fn schedule(&self, file_id: FileId) {
let f = self;
f(file_id);
}
}
/// Context that is used to run a single file check task.
///
/// The task is generic over `S` because it is passed across thread boundaries and
/// we don't want to add the requirement that [`ScheduleDependency`] must be [`Send`].
struct CheckTaskContext<'a, 'scheduler, S>
where
S: ScheduleDependency,
{
dependency_scheduler: &'scheduler S,
program: &'a Program,
}
impl<'a, 'scheduler, S> CheckTaskContext<'a, 'scheduler, S>
where
S: ScheduleDependency,
{
fn as_file_context(&self) -> CheckFileContext<'scheduler> {
CheckFileContext {
dependency_scheduler: self.dependency_scheduler,
}
}
}
/// Context passed when checking a single file.
///
/// This is a trimmed down version of [`CheckTaskContext`] with the type parameter `S` erased
/// to avoid monomorphization of [`Program:check_file`].
struct CheckFileContext<'a> {
dependency_scheduler: &'a dyn ScheduleDependency,
}
impl<'a> CheckFileContext<'a> {
fn schedule_dependency(&self, file_id: FileId) {
self.dependency_scheduler.schedule(file_id);
}
}
#[derive(Debug)]
enum CheckFileTask {
OpenFile(CheckOpenFileTask),
Dependency(CheckDependencyTask),
}
impl CheckFileTask {
/// Runs the task and returns the results for checking this file.
fn run<S>(&self, context: &CheckTaskContext<S>) -> QueryResult<Diagnostics>
where
S: ScheduleDependency,
{
match self {
Self::OpenFile(task) => task.run(context),
Self::Dependency(task) => task.run(context),
}
}
fn file_id(&self) -> FileId {
match self {
CheckFileTask::OpenFile(task) => task.file_id,
CheckFileTask::Dependency(task) => task.file_id,
}
}
}
/// Task to check an open file.
#[derive(Debug)]
struct CheckOpenFileTask {
file_id: FileId,
}
impl CheckOpenFileTask {
fn run<S>(&self, context: &CheckTaskContext<S>) -> QueryResult<Diagnostics>
where
S: ScheduleDependency,
{
context
.program
.check_file(self.file_id, &context.as_file_context())
}
}
/// Task to check a dependency file.
#[derive(Debug)]
struct CheckDependencyTask {
file_id: FileId,
}
impl CheckDependencyTask {
fn run<S>(&self, context: &CheckTaskContext<S>) -> QueryResult<Diagnostics>
where
S: ScheduleDependency,
{
context
.program
.check_file(self.file_id, &context.as_file_context())
}
}
/// Executor that schedules the checking of individual program files.
trait CheckExecutor {
fn run(self, context: &mut CheckContext) -> QueryResult<()>;
}
/// Executor that runs all check operations on the current thread.
///
/// The executor does not schedule dependencies for checking.
/// The main motivation for scheduling dependencies
/// in a multithreaded environment is to parse and index the dependencies concurrently.
/// However, that doesn't make sense in a single threaded environment, because the dependencies then compute
/// with checking the open files. Checking dependencies in a single threaded environment is more likely
/// to hurt performance because we end up analyzing files in their entirety, even if we only need to type check parts of them.
#[derive(Debug, Default)]
struct SingleThreadedExecutor;
impl CheckExecutor for SingleThreadedExecutor {
fn run(self, context: &mut CheckContext) -> QueryResult<()> {
let mut queue = context.check_open_files();
let noop_schedule_dependency = |_| {};
while let Some(file) = queue.pop() {
context.program().cancelled()?;
let task_context = context.task_context(&noop_schedule_dependency);
context.push_diagnostics(&file.run(&task_context)?);
}
Ok(())
}
}
/// Executor that runs the check operations on a thread pool.
///
/// The executor runs each check operation as its own task using a thread pool.
///
/// Other than [`SingleThreadedExecutor`], this executor schedules dependencies for checking. It
/// even schedules dependencies for checking when the thread pool size is 1 for a better debugging experience.
#[derive(Debug, Default)]
struct ThreadPoolExecutor;
impl CheckExecutor for ThreadPoolExecutor {
fn run(self, context: &mut CheckContext) -> QueryResult<()> {
let num_threads = current_num_threads();
let single_threaded = num_threads == 1;
let span = tracing::trace_span!("ThreadPoolExecutor::run", num_threads);
let _ = span.enter();
let mut queue: Vec<_> = context
.check_open_files()
.into_iter()
.map(CheckFileTask::OpenFile)
.collect();
let (sender, receiver) = if single_threaded {
// Use an unbounded queue for single threaded execution to prevent deadlocks
// when a single file schedules multiple dependencies.
crossbeam::channel::unbounded()
} else {
// Use a bounded queue to apply backpressure when the orchestration thread isn't able to keep
// up processing messages from the worker threads.
crossbeam::channel::bounded(num_threads)
};
let schedule_sender = sender.clone();
let schedule_dependency = move |file_id| {
schedule_sender
.send(ThreadPoolMessage::ScheduleDependency(file_id))
.unwrap();
};
let result = rayon::in_place_scope(|scope| {
let mut pending = 0usize;
loop {
context.program().cancelled()?;
// 1. Try to get a queued message to ensure that we have always remaining space in the channel to prevent blocking the worker threads.
// 2. Try to process a queued file
// 3. If there's no queued file wait for the next incoming message.
// 4. Exit if there are no more messages and no senders.
let message = if let Ok(message) = receiver.try_recv() {
message
} else if let Some(task) = queue.pop() {
pending += 1;
let task_context = context.task_context(&schedule_dependency);
let sender = sender.clone();
let task_span = tracing::trace_span!(
parent: &span,
"CheckFileTask::run",
file_id = task.file_id().as_u32(),
);
scope.spawn(move |_| {
task_span.in_scope(|| match task.run(&task_context) {
Ok(result) => {
sender.send(ThreadPoolMessage::Completed(result)).unwrap();
}
Err(err) => sender.send(ThreadPoolMessage::Errored(err)).unwrap(),
});
});
// If this is a single threaded rayon thread pool, yield the current thread
// or we never start processing the work items.
if single_threaded {
yield_local();
}
continue;
} else if let Ok(message) = receiver.recv() {
message
} else {
break;
};
match message {
ThreadPoolMessage::ScheduleDependency(dependency) => {
if let Some(task) = context.check_dependency(dependency) {
queue.push(CheckFileTask::Dependency(task));
}
}
ThreadPoolMessage::Completed(diagnostics) => {
context.push_diagnostics(&diagnostics);
pending -= 1;
if pending == 0 && queue.is_empty() {
break;
}
}
ThreadPoolMessage::Errored(err) => {
return Err(err);
}
}
}
Ok(())
});
result
}
}
#[derive(Debug)]
enum ThreadPoolMessage {
ScheduleDependency(FileId),
Completed(Diagnostics),
Errored(QueryError),
}

View File

@@ -0,0 +1,44 @@
use crate::db::{QueryResult, SourceDb};
use crate::format::{FormatDb, FormatError, FormattedFile};
use crate::program::Program;
impl Program {
#[tracing::instrument(level = "trace", skip(self))]
pub fn format(&mut self) -> QueryResult<()> {
// Formats all open files
// TODO make `Executor` from `check` reusable.
for file in self.workspace.open_files() {
match self.format_file(file) {
Ok(FormattedFile::Formatted(content)) => {
let path = self.file_path(file);
// TODO: This is problematic because it immediately re-triggers the file watcher.
// A possible solution is to track the self "inflicted" changes inside of programs
// by tracking the file revision right after the write. It could then use the revision
// to determine which changes are safe to ignore (and in which context).
// An other alternative is to not write as part of the `format` command and instead
// return a Vec with the format results and leave the writing to the caller.
// I think that's undesired because a) we still need a way to tell the formatter
// that it won't be necessary to format the content again and
// b) it would reduce concurrency because the writing would need to wait for the file
// formatting to be complete, unless we use some form of communication channel.
std::fs::write(path, content).expect("Unable to write file");
}
Ok(FormattedFile::Unchanged) => {
// No op
}
Err(FormatError::Query(error)) => {
return Err(error);
}
Err(FormatError::Format(error)) => {
// TODO proper error handling. We should either propagate this error or
// emit a diagnostic (probably this).
tracing::warn!("Failed to format file: {}", error);
}
}
}
Ok(())
}
}

View File

@@ -0,0 +1,252 @@
use ruff_formatter::PrintedRange;
use ruff_text_size::TextRange;
use std::path::Path;
use std::sync::Arc;
use crate::db::{
Database, Db, DbRuntime, HasJar, HasJars, JarsStorage, LintDb, LintJar, ParallelDatabase,
QueryResult, SemanticDb, SemanticJar, Snapshot, SourceDb, SourceJar,
};
use crate::files::{FileId, Files};
use crate::format::{
check_formatted, format_file, format_file_range, FormatDb, FormatError, FormatJar,
FormattedFile,
};
use crate::lint::{lint_semantic, lint_syntax, Diagnostics};
use crate::module::{
add_module, file_to_module, path_to_module, resolve_module, set_module_search_paths, Module,
ModuleData, ModuleName, ModuleSearchPath,
};
use crate::parse::{parse, Parsed};
use crate::source::{source_text, Source};
use crate::symbols::{symbol_table, SymbolId, SymbolTable};
use crate::types::{infer_symbol_type, Type};
use crate::Workspace;
pub mod check;
mod format;
#[derive(Debug)]
pub struct Program {
jars: JarsStorage<Program>,
files: Files,
workspace: Workspace,
}
impl Program {
pub fn new(workspace: Workspace) -> Self {
Self {
jars: JarsStorage::default(),
files: Files::default(),
workspace,
}
}
pub fn apply_changes<I>(&mut self, changes: I)
where
I: IntoIterator<Item = FileChange>,
{
let (source, semantic, lint, format) = self.jars_mut();
for change in changes {
semantic.module_resolver.remove_module(change.id);
semantic.symbol_tables.remove(&change.id);
source.sources.remove(&change.id);
source.parsed.remove(&change.id);
// TODO: remove all dependent modules as well
semantic.type_store.remove_module(change.id);
lint.lint_syntax.remove(&change.id);
lint.lint_semantic.remove(&change.id);
format.formatted.remove(&change.id);
}
}
pub fn files(&self) -> &Files {
&self.files
}
pub fn workspace(&self) -> &Workspace {
&self.workspace
}
pub fn workspace_mut(&mut self) -> &mut Workspace {
&mut self.workspace
}
}
impl SourceDb for Program {
fn file_id(&self, path: &Path) -> FileId {
self.files.intern(path)
}
fn file_path(&self, file_id: FileId) -> Arc<Path> {
self.files.path(file_id)
}
fn source(&self, file_id: FileId) -> QueryResult<Source> {
source_text(self, file_id)
}
fn parse(&self, file_id: FileId) -> QueryResult<Parsed> {
parse(self, file_id)
}
}
impl SemanticDb for Program {
fn resolve_module(&self, name: ModuleName) -> QueryResult<Option<Module>> {
resolve_module(self, name)
}
fn file_to_module(&self, file_id: FileId) -> QueryResult<Option<Module>> {
file_to_module(self, file_id)
}
fn path_to_module(&self, path: &Path) -> QueryResult<Option<Module>> {
path_to_module(self, path)
}
fn symbol_table(&self, file_id: FileId) -> QueryResult<Arc<SymbolTable>> {
symbol_table(self, file_id)
}
fn infer_symbol_type(&self, file_id: FileId, symbol_id: SymbolId) -> QueryResult<Type> {
infer_symbol_type(self, file_id, symbol_id)
}
// Mutations
fn add_module(&mut self, path: &Path) -> Option<(Module, Vec<Arc<ModuleData>>)> {
add_module(self, path)
}
fn set_module_search_paths(&mut self, paths: Vec<ModuleSearchPath>) {
set_module_search_paths(self, paths);
}
}
impl LintDb for Program {
fn lint_syntax(&self, file_id: FileId) -> QueryResult<Diagnostics> {
lint_syntax(self, file_id)
}
fn lint_semantic(&self, file_id: FileId) -> QueryResult<Diagnostics> {
lint_semantic(self, file_id)
}
}
impl FormatDb for Program {
fn format_file(&self, file_id: FileId) -> Result<FormattedFile, FormatError> {
format_file(self, file_id)
}
fn format_file_range(
&self,
file_id: FileId,
range: TextRange,
) -> Result<PrintedRange, FormatError> {
format_file_range(self, file_id, range)
}
fn check_file_formatted(&self, file_id: FileId) -> Result<Diagnostics, FormatError> {
check_formatted(self, file_id)
}
}
impl Db for Program {}
impl Database for Program {
fn runtime(&self) -> &DbRuntime {
self.jars.runtime()
}
fn runtime_mut(&mut self) -> &mut DbRuntime {
self.jars.runtime_mut()
}
}
impl ParallelDatabase for Program {
fn snapshot(&self) -> Snapshot<Self> {
Snapshot::new(Self {
jars: self.jars.snapshot(),
files: self.files.clone(),
workspace: self.workspace.clone(),
})
}
}
impl HasJars for Program {
type Jars = (SourceJar, SemanticJar, LintJar, FormatJar);
fn jars(&self) -> QueryResult<&Self::Jars> {
self.jars.jars()
}
fn jars_mut(&mut self) -> &mut Self::Jars {
self.jars.jars_mut()
}
}
impl HasJar<SourceJar> for Program {
fn jar(&self) -> QueryResult<&SourceJar> {
Ok(&self.jars()?.0)
}
fn jar_mut(&mut self) -> &mut SourceJar {
&mut self.jars_mut().0
}
}
impl HasJar<SemanticJar> for Program {
fn jar(&self) -> QueryResult<&SemanticJar> {
Ok(&self.jars()?.1)
}
fn jar_mut(&mut self) -> &mut SemanticJar {
&mut self.jars_mut().1
}
}
impl HasJar<LintJar> for Program {
fn jar(&self) -> QueryResult<&LintJar> {
Ok(&self.jars()?.2)
}
fn jar_mut(&mut self) -> &mut LintJar {
&mut self.jars_mut().2
}
}
impl HasJar<FormatJar> for Program {
fn jar(&self) -> QueryResult<&FormatJar> {
Ok(&self.jars()?.3)
}
fn jar_mut(&mut self) -> &mut FormatJar {
&mut self.jars_mut().3
}
}
#[derive(Copy, Clone, Debug)]
pub struct FileChange {
id: FileId,
kind: FileChangeKind,
}
impl FileChange {
pub fn new(file_id: FileId, kind: FileChangeKind) -> Self {
Self { id: file_id, kind }
}
pub fn file_id(&self) -> FileId {
self.id
}
pub fn kind(&self) -> FileChangeKind {
self.kind
}
}
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum FileChangeKind {
Created,
Modified,
Deleted,
}

View File

@@ -0,0 +1,96 @@
use crate::cache::KeyValueCache;
use crate::db::{HasJar, QueryResult, SourceDb, SourceJar};
use ruff_notebook::Notebook;
use ruff_python_ast::PySourceType;
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use crate::files::FileId;
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn source_text<Db>(db: &Db, file_id: FileId) -> QueryResult<Source>
where
Db: SourceDb + HasJar<SourceJar>,
{
let sources = &db.jar()?.sources;
sources.get(&file_id, |file_id| {
let path = db.file_path(*file_id);
let source_text = std::fs::read_to_string(&path).unwrap_or_else(|err| {
tracing::error!("Failed to read file '{path:?}: {err}'. Falling back to empty text");
String::new()
});
let python_ty = PySourceType::from(&path);
let kind = match python_ty {
PySourceType::Python => {
SourceKind::Python(Arc::from(source_text))
}
PySourceType::Stub => SourceKind::Stub(Arc::from(source_text)),
PySourceType::Ipynb => {
let notebook = Notebook::from_source_code(&source_text).unwrap_or_else(|err| {
// TODO should this be changed to never fail?
// or should we instead add a diagnostic somewhere? But what would we return in this case?
tracing::error!(
"Failed to parse notebook '{path:?}: {err}'. Falling back to an empty notebook"
);
Notebook::from_source_code("").unwrap()
});
SourceKind::IpyNotebook(Arc::new(notebook))
}
};
Ok(Source { kind })
})
}
#[derive(Debug, Clone, PartialEq)]
pub enum SourceKind {
Python(Arc<str>),
Stub(Arc<str>),
IpyNotebook(Arc<Notebook>),
}
#[derive(Debug, Clone, PartialEq)]
pub struct Source {
kind: SourceKind,
}
impl Source {
pub fn python<T: Into<Arc<str>>>(source: T) -> Self {
Self {
kind: SourceKind::Python(source.into()),
}
}
pub fn kind(&self) -> &SourceKind {
&self.kind
}
pub fn text(&self) -> &str {
match &self.kind {
SourceKind::Python(text) => text,
SourceKind::Stub(text) => text,
SourceKind::IpyNotebook(notebook) => notebook.source_code(),
}
}
}
#[derive(Debug, Default)]
pub struct SourceStorage(pub(crate) KeyValueCache<FileId, Source>);
impl Deref for SourceStorage {
type Target = KeyValueCache<FileId, Source>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for SourceStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}

View File

@@ -0,0 +1,939 @@
#![allow(dead_code)]
use std::hash::{Hash, Hasher};
use std::iter::{Copied, DoubleEndedIterator, FusedIterator};
use std::num::NonZeroU32;
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use bitflags::bitflags;
use hashbrown::hash_map::{Keys, RawEntryMut};
use rustc_hash::{FxHashMap, FxHasher};
use ruff_index::{newtype_index, IndexVec};
use ruff_python_ast as ast;
use ruff_python_ast::visitor::preorder::PreorderVisitor;
use crate::ast_ids::TypedNodeKey;
use crate::cache::KeyValueCache;
use crate::db::{HasJar, QueryResult, SemanticDb, SemanticJar};
use crate::files::FileId;
use crate::module::ModuleName;
use crate::Name;
#[allow(unreachable_pub)]
#[tracing::instrument(level = "debug", skip(db))]
pub fn symbol_table<Db>(db: &Db, file_id: FileId) -> QueryResult<Arc<SymbolTable>>
where
Db: SemanticDb + HasJar<SemanticJar>,
{
let jar = db.jar()?;
jar.symbol_tables.get(&file_id, |_| {
let parsed = db.parse(file_id)?;
Ok(Arc::from(SymbolTable::from_ast(parsed.ast())))
})
}
type Map<K, V> = hashbrown::HashMap<K, V, ()>;
#[newtype_index]
pub(crate) struct ScopeId;
impl ScopeId {
pub(crate) fn scope(self, table: &SymbolTable) -> &Scope {
&table.scopes_by_id[self]
}
}
#[newtype_index]
pub struct SymbolId;
impl SymbolId {
pub(crate) fn symbol(self, table: &SymbolTable) -> &Symbol {
&table.symbols_by_id[self]
}
}
#[derive(Copy, Clone, Debug, PartialEq)]
pub(crate) enum ScopeKind {
Module,
Annotation,
Class,
Function,
}
#[derive(Debug)]
pub(crate) struct Scope {
name: Name,
kind: ScopeKind,
child_scopes: Vec<ScopeId>,
// symbol IDs, hashed by symbol name
symbols_by_name: Map<SymbolId, ()>,
}
impl Scope {
pub(crate) fn name(&self) -> &str {
self.name.as_str()
}
pub(crate) fn kind(&self) -> ScopeKind {
self.kind
}
}
#[derive(Debug)]
pub(crate) enum Kind {
FreeVar,
CellVar,
CellVarAssigned,
ExplicitGlobal,
ImplicitGlobal,
}
bitflags! {
#[derive(Copy,Clone,Debug)]
pub(crate) struct SymbolFlags: u8 {
const IS_USED = 1 << 0;
const IS_DEFINED = 1 << 1;
/// TODO: This flag is not yet set by anything
const MARKED_GLOBAL = 1 << 2;
/// TODO: This flag is not yet set by anything
const MARKED_NONLOCAL = 1 << 3;
}
}
#[derive(Debug)]
pub(crate) struct Symbol {
name: Name,
flags: SymbolFlags,
// kind: Kind,
}
impl Symbol {
pub(crate) fn name(&self) -> &str {
self.name.as_str()
}
/// Is the symbol used in its containing scope?
pub(crate) fn is_used(&self) -> bool {
self.flags.contains(SymbolFlags::IS_USED)
}
/// Is the symbol defined in its containing scope?
pub(crate) fn is_defined(&self) -> bool {
self.flags.contains(SymbolFlags::IS_DEFINED)
}
// TODO: implement Symbol.kind 2-pass analysis to categorize as: free-var, cell-var,
// explicit-global, implicit-global and implement Symbol.kind by modifying the preorder
// traversal code
}
// TODO storing TypedNodeKey for definitions means we have to search to find them again in the AST;
// this is at best O(log n). If looking up definitions is a bottleneck we should look for
// alternatives here.
#[derive(Clone, Debug)]
pub(crate) enum Definition {
// For the import cases, we don't need reference to any arbitrary AST subtrees (annotations,
// RHS), and referencing just the import statement node is imprecise (a single import statement
// can assign many symbols, we'd have to re-search for the one we care about), so we just copy
// the small amount of information we need from the AST.
Import(ImportDefinition),
ImportFrom(ImportFromDefinition),
ClassDef(TypedNodeKey<ast::StmtClassDef>),
FunctionDef(TypedNodeKey<ast::StmtFunctionDef>),
Assignment(TypedNodeKey<ast::StmtAssign>),
AnnotatedAssignment(TypedNodeKey<ast::StmtAnnAssign>),
// TODO with statements, except handlers, function args...
}
#[derive(Clone, Debug)]
pub(crate) struct ImportDefinition {
pub(crate) module: ModuleName,
}
#[derive(Clone, Debug)]
pub(crate) struct ImportFromDefinition {
pub(crate) module: Option<ModuleName>,
pub(crate) name: Name,
pub(crate) level: u32,
}
impl ImportFromDefinition {
pub(crate) fn module(&self) -> Option<&ModuleName> {
self.module.as_ref()
}
pub(crate) fn name(&self) -> &Name {
&self.name
}
pub(crate) fn level(&self) -> u32 {
self.level
}
}
#[derive(Debug, Clone)]
pub enum Dependency {
Module(ModuleName),
Relative {
level: NonZeroU32,
module: Option<ModuleName>,
},
}
/// Table of all symbols in all scopes for a module.
#[derive(Debug)]
pub struct SymbolTable {
scopes_by_id: IndexVec<ScopeId, Scope>,
symbols_by_id: IndexVec<SymbolId, Symbol>,
defs: FxHashMap<SymbolId, Vec<Definition>>,
dependencies: Vec<Dependency>,
}
impl SymbolTable {
pub(crate) fn from_ast(module: &ast::ModModule) -> Self {
let root_scope_id = SymbolTable::root_scope_id();
let mut builder = SymbolTableBuilder {
table: SymbolTable::new(),
scopes: vec![root_scope_id],
current_definition: None,
};
builder.visit_body(&module.body);
builder.table
}
pub(crate) fn new() -> Self {
let mut table = SymbolTable {
scopes_by_id: IndexVec::new(),
symbols_by_id: IndexVec::new(),
defs: FxHashMap::default(),
dependencies: Vec::new(),
};
table.scopes_by_id.push(Scope {
name: Name::new("<module>"),
kind: ScopeKind::Module,
child_scopes: Vec::new(),
symbols_by_name: Map::default(),
});
table
}
pub(crate) fn dependencies(&self) -> &[Dependency] {
&self.dependencies
}
pub(crate) const fn root_scope_id() -> ScopeId {
ScopeId::from_usize(0)
}
pub(crate) fn root_scope(&self) -> &Scope {
&self.scopes_by_id[SymbolTable::root_scope_id()]
}
pub(crate) fn symbol_ids_for_scope(&self, scope_id: ScopeId) -> Copied<Keys<SymbolId, ()>> {
self.scopes_by_id[scope_id].symbols_by_name.keys().copied()
}
pub(crate) fn symbols_for_scope(
&self,
scope_id: ScopeId,
) -> SymbolIterator<Copied<Keys<SymbolId, ()>>> {
SymbolIterator {
table: self,
ids: self.symbol_ids_for_scope(scope_id),
}
}
pub(crate) fn root_symbol_ids(&self) -> Copied<Keys<SymbolId, ()>> {
self.symbol_ids_for_scope(SymbolTable::root_scope_id())
}
pub(crate) fn root_symbols(&self) -> SymbolIterator<Copied<Keys<SymbolId, ()>>> {
self.symbols_for_scope(SymbolTable::root_scope_id())
}
pub(crate) fn child_scope_ids_of(&self, scope_id: ScopeId) -> &[ScopeId] {
&self.scopes_by_id[scope_id].child_scopes
}
pub(crate) fn child_scopes_of(&self, scope_id: ScopeId) -> ScopeIterator<&[ScopeId]> {
ScopeIterator {
table: self,
ids: self.child_scope_ids_of(scope_id),
}
}
pub(crate) fn root_child_scope_ids(&self) -> &[ScopeId] {
self.child_scope_ids_of(SymbolTable::root_scope_id())
}
pub(crate) fn root_child_scopes(&self) -> ScopeIterator<&[ScopeId]> {
self.child_scopes_of(SymbolTable::root_scope_id())
}
pub(crate) fn symbol_id_by_name(&self, scope_id: ScopeId, name: &str) -> Option<SymbolId> {
let scope = &self.scopes_by_id[scope_id];
let hash = SymbolTable::hash_name(name);
let name = Name::new(name);
scope
.symbols_by_name
.raw_entry()
.from_hash(hash, |symid| self.symbols_by_id[*symid].name == name)
.map(|(symbol_id, ())| *symbol_id)
}
pub(crate) fn symbol_by_name(&self, scope_id: ScopeId, name: &str) -> Option<&Symbol> {
Some(&self.symbols_by_id[self.symbol_id_by_name(scope_id, name)?])
}
pub(crate) fn root_symbol_id_by_name(&self, name: &str) -> Option<SymbolId> {
self.symbol_id_by_name(SymbolTable::root_scope_id(), name)
}
pub(crate) fn root_symbol_by_name(&self, name: &str) -> Option<&Symbol> {
self.symbol_by_name(SymbolTable::root_scope_id(), name)
}
pub(crate) fn definitions(&self, symbol_id: SymbolId) -> &[Definition] {
self.defs
.get(&symbol_id)
.map(std::vec::Vec::as_slice)
.unwrap_or_default()
}
pub(crate) fn all_definitions(&self) -> impl Iterator<Item = (SymbolId, &Definition)> + '_ {
self.defs
.iter()
.flat_map(|(sym_id, defs)| defs.iter().map(move |def| (*sym_id, def)))
}
fn add_or_update_symbol(
&mut self,
scope_id: ScopeId,
name: &str,
flags: SymbolFlags,
) -> SymbolId {
let hash = SymbolTable::hash_name(name);
let scope = &mut self.scopes_by_id[scope_id];
let name = Name::new(name);
let entry = scope
.symbols_by_name
.raw_entry_mut()
.from_hash(hash, |existing| self.symbols_by_id[*existing].name == name);
match entry {
RawEntryMut::Occupied(entry) => {
if let Some(symbol) = self.symbols_by_id.get_mut(*entry.key()) {
symbol.flags.insert(flags);
};
*entry.key()
}
RawEntryMut::Vacant(entry) => {
let id = self.symbols_by_id.push(Symbol { name, flags });
entry.insert_with_hasher(hash, id, (), |_| hash);
id
}
}
}
fn add_child_scope(
&mut self,
parent_scope_id: ScopeId,
name: &str,
kind: ScopeKind,
) -> ScopeId {
let new_scope_id = self.scopes_by_id.push(Scope {
name: Name::new(name),
kind,
child_scopes: Vec::new(),
symbols_by_name: Map::default(),
});
let parent_scope = &mut self.scopes_by_id[parent_scope_id];
parent_scope.child_scopes.push(new_scope_id);
new_scope_id
}
fn hash_name(name: &str) -> u64 {
let mut hasher = FxHasher::default();
name.hash(&mut hasher);
hasher.finish()
}
}
pub(crate) struct SymbolIterator<'a, I> {
table: &'a SymbolTable,
ids: I,
}
impl<'a, I> Iterator for SymbolIterator<'a, I>
where
I: Iterator<Item = SymbolId>,
{
type Item = &'a Symbol;
fn next(&mut self) -> Option<Self::Item> {
let id = self.ids.next()?;
Some(&self.table.symbols_by_id[id])
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.ids.size_hint()
}
}
impl<'a, I> FusedIterator for SymbolIterator<'a, I> where
I: Iterator<Item = SymbolId> + FusedIterator
{
}
impl<'a, I> DoubleEndedIterator for SymbolIterator<'a, I>
where
I: Iterator<Item = SymbolId> + DoubleEndedIterator,
{
fn next_back(&mut self) -> Option<Self::Item> {
let id = self.ids.next_back()?;
Some(&self.table.symbols_by_id[id])
}
}
pub(crate) struct ScopeIterator<'a, I> {
table: &'a SymbolTable,
ids: I,
}
impl<'a, I> Iterator for ScopeIterator<'a, I>
where
I: Iterator<Item = ScopeId>,
{
type Item = &'a Scope;
fn next(&mut self) -> Option<Self::Item> {
let id = self.ids.next()?;
Some(&self.table.scopes_by_id[id])
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.ids.size_hint()
}
}
impl<'a, I> FusedIterator for ScopeIterator<'a, I> where I: Iterator<Item = ScopeId> + FusedIterator {}
impl<'a, I> DoubleEndedIterator for ScopeIterator<'a, I>
where
I: Iterator<Item = ScopeId> + DoubleEndedIterator,
{
fn next_back(&mut self) -> Option<Self::Item> {
let id = self.ids.next_back()?;
Some(&self.table.scopes_by_id[id])
}
}
struct SymbolTableBuilder {
table: SymbolTable,
scopes: Vec<ScopeId>,
/// the definition whose target(s) we are currently walking
current_definition: Option<Definition>,
}
impl SymbolTableBuilder {
fn add_or_update_symbol(&mut self, identifier: &str, flags: SymbolFlags) -> SymbolId {
self.table
.add_or_update_symbol(self.cur_scope(), identifier, flags)
}
fn add_or_update_symbol_with_def(
&mut self,
identifier: &str,
definition: Definition,
) -> SymbolId {
let symbol_id = self.add_or_update_symbol(identifier, SymbolFlags::IS_DEFINED);
self.table
.defs
.entry(symbol_id)
.or_default()
.push(definition);
symbol_id
}
fn push_scope(&mut self, child_of: ScopeId, name: &str, kind: ScopeKind) -> ScopeId {
let scope_id = self.table.add_child_scope(child_of, name, kind);
self.scopes.push(scope_id);
scope_id
}
fn pop_scope(&mut self) -> ScopeId {
self.scopes
.pop()
.expect("Scope stack should never be empty")
}
fn cur_scope(&self) -> ScopeId {
*self
.scopes
.last()
.expect("Scope stack should never be empty")
}
fn with_type_params(
&mut self,
name: &str,
params: &Option<Box<ast::TypeParams>>,
nested: impl FnOnce(&mut Self),
) {
if let Some(type_params) = params {
self.push_scope(self.cur_scope(), name, ScopeKind::Annotation);
for type_param in &type_params.type_params {
let name = match type_param {
ast::TypeParam::TypeVar(ast::TypeParamTypeVar { name, .. }) => name,
ast::TypeParam::ParamSpec(ast::TypeParamParamSpec { name, .. }) => name,
ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple { name, .. }) => name,
};
self.add_or_update_symbol(name, SymbolFlags::IS_DEFINED);
}
}
nested(self);
if params.is_some() {
self.pop_scope();
}
}
}
impl PreorderVisitor<'_> for SymbolTableBuilder {
fn visit_expr(&mut self, expr: &ast::Expr) {
if let ast::Expr::Name(ast::ExprName { id, ctx, .. }) = expr {
let flags = match ctx {
ast::ExprContext::Load => SymbolFlags::IS_USED,
ast::ExprContext::Store => SymbolFlags::IS_DEFINED,
ast::ExprContext::Del => SymbolFlags::IS_DEFINED,
ast::ExprContext::Invalid => SymbolFlags::empty(),
};
self.add_or_update_symbol(id, flags);
if flags.contains(SymbolFlags::IS_DEFINED) {
if let Some(curdef) = self.current_definition.clone() {
self.add_or_update_symbol_with_def(id, curdef);
}
}
}
ast::visitor::preorder::walk_expr(self, expr);
}
fn visit_stmt(&mut self, stmt: &ast::Stmt) {
// TODO need to capture more definition statements here
match stmt {
ast::Stmt::ClassDef(node) => {
let def = Definition::ClassDef(TypedNodeKey::from_node(node));
self.add_or_update_symbol_with_def(&node.name, def);
self.with_type_params(&node.name, &node.type_params, |builder| {
builder.push_scope(builder.cur_scope(), &node.name, ScopeKind::Class);
ast::visitor::preorder::walk_stmt(builder, stmt);
builder.pop_scope();
});
}
ast::Stmt::FunctionDef(node) => {
let def = Definition::FunctionDef(TypedNodeKey::from_node(node));
self.add_or_update_symbol_with_def(&node.name, def);
self.with_type_params(&node.name, &node.type_params, |builder| {
builder.push_scope(builder.cur_scope(), &node.name, ScopeKind::Function);
ast::visitor::preorder::walk_stmt(builder, stmt);
builder.pop_scope();
});
}
ast::Stmt::Import(ast::StmtImport { names, .. }) => {
for alias in names {
let symbol_name = if let Some(asname) = &alias.asname {
asname.id.as_str()
} else {
alias.name.id.split('.').next().unwrap()
};
let module = ModuleName::new(&alias.name.id);
let def = Definition::Import(ImportDefinition {
module: module.clone(),
});
self.add_or_update_symbol_with_def(symbol_name, def);
self.table.dependencies.push(Dependency::Module(module));
}
}
ast::Stmt::ImportFrom(ast::StmtImportFrom {
module,
names,
level,
..
}) => {
let module = module.as_ref().map(|m| ModuleName::new(&m.id));
for alias in names {
let symbol_name = if let Some(asname) = &alias.asname {
asname.id.as_str()
} else {
alias.name.id.as_str()
};
let def = Definition::ImportFrom(ImportFromDefinition {
module: module.clone(),
name: Name::new(&alias.name.id),
level: *level,
});
self.add_or_update_symbol_with_def(symbol_name, def);
}
let dependency = if let Some(module) = module {
match NonZeroU32::new(*level) {
Some(level) => Dependency::Relative {
level,
module: Some(module),
},
None => Dependency::Module(module),
}
} else {
Dependency::Relative {
level: NonZeroU32::new(*level)
.expect("Import without a module to have a level > 0"),
module,
}
};
self.table.dependencies.push(dependency);
}
ast::Stmt::Assign(node) => {
debug_assert!(self.current_definition.is_none());
self.current_definition =
Some(Definition::Assignment(TypedNodeKey::from_node(node)));
ast::visitor::preorder::walk_stmt(self, stmt);
self.current_definition = None;
}
_ => {
ast::visitor::preorder::walk_stmt(self, stmt);
}
}
}
}
#[derive(Debug, Default)]
pub struct SymbolTablesStorage(KeyValueCache<FileId, Arc<SymbolTable>>);
impl Deref for SymbolTablesStorage {
type Target = KeyValueCache<FileId, Arc<SymbolTable>>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for SymbolTablesStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
#[cfg(test)]
mod tests {
use textwrap::dedent;
use crate::parse::Parsed;
use crate::symbols::ScopeKind;
use super::{SymbolFlags, SymbolId, SymbolIterator, SymbolTable};
mod from_ast {
use super::*;
fn parse(code: &str) -> Parsed {
Parsed::from_text(&dedent(code))
}
fn names<I>(it: SymbolIterator<I>) -> Vec<&str>
where
I: Iterator<Item = SymbolId>,
{
let mut symbols: Vec<_> = it.map(|sym| sym.name.as_str()).collect();
symbols.sort_unstable();
symbols
}
#[test]
fn empty() {
let parsed = parse("");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()).len(), 0);
}
#[test]
fn simple() {
let parsed = parse("x");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("x").unwrap())
.len(),
0
);
}
#[test]
fn annotation_only() {
let parsed = parse("x: int");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["int", "x"]);
// TODO record definition
}
#[test]
fn import() {
let parsed = parse("import foo");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["foo"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("foo").unwrap())
.len(),
1
);
}
#[test]
fn import_sub() {
let parsed = parse("import foo.bar");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["foo"]);
}
#[test]
fn import_as() {
let parsed = parse("import foo.bar as baz");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["baz"]);
}
#[test]
fn import_from() {
let parsed = parse("from bar import foo");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["foo"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("foo").unwrap())
.len(),
1
);
assert!(
table.root_symbol_id_by_name("foo").is_some_and(|sid| {
let s = sid.symbol(&table);
s.is_defined() || !s.is_used()
}),
"symbols that are defined get the defined flag"
);
}
#[test]
fn assign() {
let parsed = parse("x = foo");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["foo", "x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("x").unwrap())
.len(),
1
);
assert!(
table.root_symbol_id_by_name("foo").is_some_and(|sid| {
let s = sid.symbol(&table);
!s.is_defined() && s.is_used()
}),
"a symbol used but not defined in a scope should have only the used flag"
);
}
#[test]
fn class_scope() {
let parsed = parse(
"
class C:
x = 1
y = 2
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["C", "y"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let c_scope = scopes[0].scope(&table);
assert_eq!(c_scope.kind(), ScopeKind::Class);
assert_eq!(c_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("C").unwrap())
.len(),
1
);
}
#[test]
fn func_scope() {
let parsed = parse(
"
def func():
x = 1
y = 2
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["func", "y"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let func_scope = scopes[0].scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Function);
assert_eq!(func_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("func").unwrap())
.len(),
1
);
}
#[test]
fn dupes() {
let parsed = parse(
"
def func():
x = 1
def func():
y = 2
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["func"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 2);
let func_scope_1 = scopes[0].scope(&table);
let func_scope_2 = scopes[1].scope(&table);
assert_eq!(func_scope_1.kind(), ScopeKind::Function);
assert_eq!(func_scope_1.name(), "func");
assert_eq!(func_scope_2.kind(), ScopeKind::Function);
assert_eq!(func_scope_2.name(), "func");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(names(table.symbols_for_scope(scopes[1])), vec!["y"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("func").unwrap())
.len(),
2
);
}
#[test]
fn generic_func() {
let parsed = parse(
"
def func[T]():
x = 1
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["func"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let ann_scope_id = scopes[0];
let ann_scope = ann_scope_id.scope(&table);
assert_eq!(ann_scope.kind(), ScopeKind::Annotation);
assert_eq!(ann_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(ann_scope_id)), vec!["T"]);
let scopes = table.child_scope_ids_of(ann_scope_id);
assert_eq!(scopes.len(), 1);
let func_scope_id = scopes[0];
let func_scope = func_scope_id.scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Function);
assert_eq!(func_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(func_scope_id)), vec!["x"]);
}
#[test]
fn generic_class() {
let parsed = parse(
"
class C[T]:
x = 1
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["C"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let ann_scope_id = scopes[0];
let ann_scope = ann_scope_id.scope(&table);
assert_eq!(ann_scope.kind(), ScopeKind::Annotation);
assert_eq!(ann_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(ann_scope_id)), vec!["T"]);
assert!(
table
.symbol_by_name(ann_scope_id, "T")
.is_some_and(|s| s.is_defined() && !s.is_used()),
"type parameters are defined by the scope that introduces them"
);
let scopes = table.child_scope_ids_of(ann_scope_id);
assert_eq!(scopes.len(), 1);
let func_scope_id = scopes[0];
let func_scope = func_scope_id.scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Class);
assert_eq!(func_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(func_scope_id)), vec!["x"]);
}
}
#[test]
fn insert_same_name_symbol_twice() {
let mut table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let symbol_id_1 = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::IS_DEFINED);
let symbol_id_2 = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::IS_USED);
assert_eq!(symbol_id_1, symbol_id_2);
assert!(symbol_id_1.symbol(&table).is_used(), "flags must merge");
assert!(symbol_id_1.symbol(&table).is_defined(), "flags must merge");
}
#[test]
fn insert_different_named_symbols() {
let mut table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let symbol_id_1 = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let symbol_id_2 = table.add_or_update_symbol(root_scope_id, "bar", SymbolFlags::empty());
assert_ne!(symbol_id_1, symbol_id_2);
}
#[test]
fn add_child_scope_with_symbol() {
let mut table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let foo_symbol_top = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let c_scope = table.add_child_scope(root_scope_id, "C", ScopeKind::Class);
let foo_symbol_inner = table.add_or_update_symbol(c_scope, "foo", SymbolFlags::empty());
assert_ne!(foo_symbol_top, foo_symbol_inner);
}
#[test]
fn scope_from_id() {
let table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let scope = root_scope_id.scope(&table);
assert_eq!(scope.name.as_str(), "<module>");
assert_eq!(scope.kind, ScopeKind::Module);
}
#[test]
fn symbol_from_id() {
let mut table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let foo_symbol_id = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let symbol = foo_symbol_id.symbol(&table);
assert_eq!(symbol.name.as_str(), "foo");
}
}

View File

@@ -0,0 +1,560 @@
#![allow(dead_code)]
use crate::ast_ids::NodeKey;
use crate::files::FileId;
use crate::symbols::SymbolId;
use crate::{FxDashMap, FxIndexSet, Name};
use ruff_index::{newtype_index, IndexVec};
use rustc_hash::FxHashMap;
pub(crate) mod infer;
pub(crate) use infer::infer_symbol_type;
/// unique ID for a type
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum Type {
/// the dynamic or gradual type: a statically-unknown set of values
Any,
/// the empty set of values
Never,
/// unknown type (no annotation)
/// equivalent to Any, or to object in strict mode
Unknown,
/// name is not bound to any value
Unbound,
/// a specific function object
Function(FunctionTypeId),
/// a specific class object
Class(ClassTypeId),
/// the set of Python objects with the given class in their __class__'s method resolution order
Instance(ClassTypeId),
Union(UnionTypeId),
Intersection(IntersectionTypeId),
// TODO protocols, callable types, overloads, generics, type vars
}
impl Type {
fn display<'a>(&'a self, store: &'a TypeStore) -> DisplayType<'a> {
DisplayType { ty: self, store }
}
pub const fn is_unbound(&self) -> bool {
matches!(self, Type::Unbound)
}
pub const fn is_unknown(&self) -> bool {
matches!(self, Type::Unknown)
}
}
impl From<FunctionTypeId> for Type {
fn from(id: FunctionTypeId) -> Self {
Type::Function(id)
}
}
impl From<UnionTypeId> for Type {
fn from(id: UnionTypeId) -> Self {
Type::Union(id)
}
}
impl From<IntersectionTypeId> for Type {
fn from(id: IntersectionTypeId) -> Self {
Type::Intersection(id)
}
}
// TODO: currently calling `get_function` et al and holding on to the `FunctionTypeRef` will lock a
// shard of this dashmap, for as long as you hold the reference. This may be a problem. We could
// switch to having all the arenas hold Arc, or we could see if we can split up ModuleTypeStore,
// and/or give it inner mutability and finer-grained internal locking.
#[derive(Debug, Default)]
pub struct TypeStore {
modules: FxDashMap<FileId, ModuleTypeStore>,
}
impl TypeStore {
pub fn remove_module(&mut self, file_id: FileId) {
self.modules.remove(&file_id);
}
pub fn cache_symbol_type(&self, file_id: FileId, symbol_id: SymbolId, ty: Type) {
self.add_or_get_module(file_id)
.symbol_types
.insert(symbol_id, ty);
}
pub fn cache_node_type(&self, file_id: FileId, node_key: NodeKey, ty: Type) {
self.add_or_get_module(file_id)
.node_types
.insert(node_key, ty);
}
pub fn get_cached_symbol_type(&self, file_id: FileId, symbol_id: SymbolId) -> Option<Type> {
self.try_get_module(file_id)?
.symbol_types
.get(&symbol_id)
.copied()
}
pub fn get_cached_node_type(&self, file_id: FileId, node_key: &NodeKey) -> Option<Type> {
self.try_get_module(file_id)?
.node_types
.get(node_key)
.copied()
}
fn add_or_get_module(&self, file_id: FileId) -> ModuleStoreRefMut {
self.modules
.entry(file_id)
.or_insert_with(|| ModuleTypeStore::new(file_id))
}
fn get_module(&self, file_id: FileId) -> ModuleStoreRef {
self.try_get_module(file_id).expect("module should exist")
}
fn try_get_module(&self, file_id: FileId) -> Option<ModuleStoreRef> {
self.modules.get(&file_id)
}
fn add_function(&self, file_id: FileId, name: &str) -> FunctionTypeId {
self.add_or_get_module(file_id).add_function(name)
}
fn add_class(&self, file_id: FileId, name: &str, bases: Vec<Type>) -> ClassTypeId {
self.add_or_get_module(file_id).add_class(name, bases)
}
fn add_union(&mut self, file_id: FileId, elems: &[Type]) -> UnionTypeId {
self.add_or_get_module(file_id).add_union(elems)
}
fn add_intersection(
&mut self,
file_id: FileId,
positive: &[Type],
negative: &[Type],
) -> IntersectionTypeId {
self.add_or_get_module(file_id)
.add_intersection(positive, negative)
}
fn get_function(&self, id: FunctionTypeId) -> FunctionTypeRef {
FunctionTypeRef {
module_store: self.get_module(id.file_id),
function_id: id.func_id,
}
}
fn get_class(&self, id: ClassTypeId) -> ClassTypeRef {
ClassTypeRef {
module_store: self.get_module(id.file_id),
class_id: id.class_id,
}
}
fn get_union(&self, id: UnionTypeId) -> UnionTypeRef {
UnionTypeRef {
module_store: self.get_module(id.file_id),
union_id: id.union_id,
}
}
fn get_intersection(&self, id: IntersectionTypeId) -> IntersectionTypeRef {
IntersectionTypeRef {
module_store: self.get_module(id.file_id),
intersection_id: id.intersection_id,
}
}
}
type ModuleStoreRef<'a> = dashmap::mapref::one::Ref<
'a,
FileId,
ModuleTypeStore,
std::hash::BuildHasherDefault<rustc_hash::FxHasher>,
>;
type ModuleStoreRefMut<'a> = dashmap::mapref::one::RefMut<
'a,
FileId,
ModuleTypeStore,
std::hash::BuildHasherDefault<rustc_hash::FxHasher>,
>;
#[derive(Debug)]
pub(crate) struct FunctionTypeRef<'a> {
module_store: ModuleStoreRef<'a>,
function_id: ModuleFunctionTypeId,
}
impl<'a> std::ops::Deref for FunctionTypeRef<'a> {
type Target = FunctionType;
fn deref(&self) -> &Self::Target {
self.module_store.get_function(self.function_id)
}
}
#[derive(Debug)]
pub(crate) struct ClassTypeRef<'a> {
module_store: ModuleStoreRef<'a>,
class_id: ModuleClassTypeId,
}
impl<'a> std::ops::Deref for ClassTypeRef<'a> {
type Target = ClassType;
fn deref(&self) -> &Self::Target {
self.module_store.get_class(self.class_id)
}
}
#[derive(Debug)]
pub(crate) struct UnionTypeRef<'a> {
module_store: ModuleStoreRef<'a>,
union_id: ModuleUnionTypeId,
}
impl<'a> std::ops::Deref for UnionTypeRef<'a> {
type Target = UnionType;
fn deref(&self) -> &Self::Target {
self.module_store.get_union(self.union_id)
}
}
#[derive(Debug)]
pub(crate) struct IntersectionTypeRef<'a> {
module_store: ModuleStoreRef<'a>,
intersection_id: ModuleIntersectionTypeId,
}
impl<'a> std::ops::Deref for IntersectionTypeRef<'a> {
type Target = IntersectionType;
fn deref(&self) -> &Self::Target {
self.module_store.get_intersection(self.intersection_id)
}
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct FunctionTypeId {
file_id: FileId,
func_id: ModuleFunctionTypeId,
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct ClassTypeId {
file_id: FileId,
class_id: ModuleClassTypeId,
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct UnionTypeId {
file_id: FileId,
union_id: ModuleUnionTypeId,
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct IntersectionTypeId {
file_id: FileId,
intersection_id: ModuleIntersectionTypeId,
}
#[newtype_index]
struct ModuleFunctionTypeId;
#[newtype_index]
struct ModuleClassTypeId;
#[newtype_index]
struct ModuleUnionTypeId;
#[newtype_index]
struct ModuleIntersectionTypeId;
#[derive(Debug)]
struct ModuleTypeStore {
file_id: FileId,
/// arena of all function types defined in this module
functions: IndexVec<ModuleFunctionTypeId, FunctionType>,
/// arena of all class types defined in this module
classes: IndexVec<ModuleClassTypeId, ClassType>,
/// arenda of all union types created in this module
unions: IndexVec<ModuleUnionTypeId, UnionType>,
/// arena of all intersection types created in this module
intersections: IndexVec<ModuleIntersectionTypeId, IntersectionType>,
/// cached types of symbols in this module
symbol_types: FxHashMap<SymbolId, Type>,
/// cached types of AST nodes in this module
node_types: FxHashMap<NodeKey, Type>,
}
impl ModuleTypeStore {
fn new(file_id: FileId) -> Self {
Self {
file_id,
functions: IndexVec::default(),
classes: IndexVec::default(),
unions: IndexVec::default(),
intersections: IndexVec::default(),
symbol_types: FxHashMap::default(),
node_types: FxHashMap::default(),
}
}
fn add_function(&mut self, name: &str) -> FunctionTypeId {
let func_id = self.functions.push(FunctionType {
name: Name::new(name),
});
FunctionTypeId {
file_id: self.file_id,
func_id,
}
}
fn add_class(&mut self, name: &str, bases: Vec<Type>) -> ClassTypeId {
let class_id = self.classes.push(ClassType {
name: Name::new(name),
// TODO: if no bases are given, that should imply [object]
bases,
});
ClassTypeId {
file_id: self.file_id,
class_id,
}
}
fn add_union(&mut self, elems: &[Type]) -> UnionTypeId {
let union_id = self.unions.push(UnionType {
elements: elems.iter().copied().collect(),
});
UnionTypeId {
file_id: self.file_id,
union_id,
}
}
fn add_intersection(&mut self, positive: &[Type], negative: &[Type]) -> IntersectionTypeId {
let intersection_id = self.intersections.push(IntersectionType {
positive: positive.iter().copied().collect(),
negative: negative.iter().copied().collect(),
});
IntersectionTypeId {
file_id: self.file_id,
intersection_id,
}
}
fn get_function(&self, func_id: ModuleFunctionTypeId) -> &FunctionType {
&self.functions[func_id]
}
fn get_class(&self, class_id: ModuleClassTypeId) -> &ClassType {
&self.classes[class_id]
}
fn get_union(&self, union_id: ModuleUnionTypeId) -> &UnionType {
&self.unions[union_id]
}
fn get_intersection(&self, intersection_id: ModuleIntersectionTypeId) -> &IntersectionType {
&self.intersections[intersection_id]
}
}
#[derive(Copy, Clone, Debug)]
struct DisplayType<'a> {
ty: &'a Type,
store: &'a TypeStore,
}
impl std::fmt::Display for DisplayType<'_> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self.ty {
Type::Any => f.write_str("Any"),
Type::Never => f.write_str("Never"),
Type::Unknown => f.write_str("Unknown"),
Type::Unbound => f.write_str("Unbound"),
// TODO functions and classes should display using a fully qualified name
Type::Class(class_id) => {
f.write_str("Literal[")?;
f.write_str(self.store.get_class(*class_id).name())?;
f.write_str("]")
}
Type::Instance(class_id) => f.write_str(self.store.get_class(*class_id).name()),
Type::Function(func_id) => f.write_str(self.store.get_function(*func_id).name()),
Type::Union(union_id) => self
.store
.get_module(union_id.file_id)
.get_union(union_id.union_id)
.display(f, self.store),
Type::Intersection(int_id) => self
.store
.get_module(int_id.file_id)
.get_intersection(int_id.intersection_id)
.display(f, self.store),
}
}
}
#[derive(Debug)]
pub(crate) struct ClassType {
name: Name,
bases: Vec<Type>,
}
impl ClassType {
fn name(&self) -> &str {
self.name.as_str()
}
fn bases(&self) -> &[Type] {
self.bases.as_slice()
}
}
#[derive(Debug)]
pub(crate) struct FunctionType {
name: Name,
}
impl FunctionType {
fn name(&self) -> &str {
self.name.as_str()
}
}
#[derive(Debug)]
pub(crate) struct UnionType {
// the union type includes values in any of these types
elements: FxIndexSet<Type>,
}
impl UnionType {
fn display(&self, f: &mut std::fmt::Formatter<'_>, store: &TypeStore) -> std::fmt::Result {
f.write_str("(")?;
let mut first = true;
for ty in &self.elements {
if !first {
f.write_str(" | ")?;
};
first = false;
write!(f, "{}", ty.display(store))?;
}
f.write_str(")")
}
}
// Negation types aren't expressible in annotations, and are most likely to arise from type
// narrowing along with intersections (e.g. `if not isinstance(...)`), so we represent them
// directly in intersections rather than as a separate type. This sacrifices some efficiency in the
// case where a Not appears outside an intersection (unclear when that could even happen, but we'd
// have to represent it as a single-element intersection if it did) in exchange for better
// efficiency in the not-within-intersection case.
#[derive(Debug)]
pub(crate) struct IntersectionType {
// the intersection type includes only values in all of these types
positive: FxIndexSet<Type>,
// negated elements of the intersection, e.g.
negative: FxIndexSet<Type>,
}
impl IntersectionType {
fn display(&self, f: &mut std::fmt::Formatter<'_>, store: &TypeStore) -> std::fmt::Result {
f.write_str("(")?;
let mut first = true;
for (neg, ty) in self
.positive
.iter()
.map(|ty| (false, ty))
.chain(self.negative.iter().map(|ty| (true, ty)))
{
if !first {
f.write_str(" & ")?;
};
first = false;
if neg {
f.write_str("~")?;
};
write!(f, "{}", ty.display(store))?;
}
f.write_str(")")
}
}
#[cfg(test)]
mod tests {
use crate::files::Files;
use crate::types::{Type, TypeStore};
use crate::FxIndexSet;
use std::path::Path;
#[test]
fn add_class() {
let store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let id = store.add_class(file_id, "C", Vec::new());
assert_eq!(store.get_class(id).name(), "C");
let inst = Type::Instance(id);
assert_eq!(format!("{}", inst.display(&store)), "C");
}
#[test]
fn add_function() {
let store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let id = store.add_function(file_id, "func");
assert_eq!(store.get_function(id).name(), "func");
let func = Type::Function(id);
assert_eq!(format!("{}", func.display(&store)), "func");
}
#[test]
fn add_union() {
let mut store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let c1 = store.add_class(file_id, "C1", Vec::new());
let c2 = store.add_class(file_id, "C2", Vec::new());
let elems = vec![Type::Instance(c1), Type::Instance(c2)];
let id = store.add_union(file_id, &elems);
assert_eq!(
store.get_union(id).elements,
elems.into_iter().collect::<FxIndexSet<_>>()
);
let union = Type::Union(id);
assert_eq!(format!("{}", union.display(&store)), "(C1 | C2)");
}
#[test]
fn add_intersection() {
let mut store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let c1 = store.add_class(file_id, "C1", Vec::new());
let c2 = store.add_class(file_id, "C2", Vec::new());
let c3 = store.add_class(file_id, "C3", Vec::new());
let pos = vec![Type::Instance(c1), Type::Instance(c2)];
let neg = vec![Type::Instance(c3)];
let id = store.add_intersection(file_id, &pos, &neg);
assert_eq!(
store.get_intersection(id).positive,
pos.into_iter().collect::<FxIndexSet<_>>()
);
assert_eq!(
store.get_intersection(id).negative,
neg.into_iter().collect::<FxIndexSet<_>>()
);
let intersection = Type::Intersection(id);
assert_eq!(
format!("{}", intersection.display(&store)),
"(C1 & C2 & ~C3)"
);
}
}

View File

@@ -0,0 +1,217 @@
#![allow(dead_code)]
use ruff_python_ast::AstNode;
use crate::db::{HasJar, QueryResult, SemanticDb, SemanticJar};
use crate::module::ModuleName;
use crate::symbols::{Definition, ImportFromDefinition, SymbolId};
use crate::types::Type;
use crate::FileId;
use ruff_python_ast as ast;
// FIXME: Figure out proper dead-lock free synchronisation now that this takes `&db` instead of `&mut db`.
#[tracing::instrument(level = "trace", skip(db))]
pub fn infer_symbol_type<Db>(db: &Db, file_id: FileId, symbol_id: SymbolId) -> QueryResult<Type>
where
Db: SemanticDb + HasJar<SemanticJar>,
{
let symbols = db.symbol_table(file_id)?;
let defs = symbols.definitions(symbol_id);
if let Some(ty) = db
.jar()?
.type_store
.get_cached_symbol_type(file_id, symbol_id)
{
return Ok(ty);
}
// TODO handle multiple defs, conditional defs...
assert_eq!(defs.len(), 1);
let type_store = &db.jar()?.type_store;
let ty = match &defs[0] {
Definition::ImportFrom(ImportFromDefinition {
module,
name,
level,
}) => {
// TODO relative imports
assert!(matches!(level, 0));
let module_name = ModuleName::new(module.as_ref().expect("TODO relative imports"));
if let Some(module) = db.resolve_module(module_name)? {
let remote_file_id = module.path(db)?.file();
let remote_symbols = db.symbol_table(remote_file_id)?;
if let Some(remote_symbol_id) = remote_symbols.root_symbol_id_by_name(name) {
db.infer_symbol_type(remote_file_id, remote_symbol_id)?
} else {
Type::Unknown
}
} else {
Type::Unknown
}
}
Definition::ClassDef(node_key) => {
if let Some(ty) = type_store.get_cached_node_type(file_id, node_key.erased()) {
ty
} else {
let parsed = db.parse(file_id)?;
let ast = parsed.ast();
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
let mut bases = Vec::with_capacity(node.bases().len());
for base in node.bases() {
bases.push(infer_expr_type(db, file_id, base)?);
}
let ty = Type::Class(type_store.add_class(file_id, &node.name.id, bases));
type_store.cache_node_type(file_id, *node_key.erased(), ty);
ty
}
}
Definition::FunctionDef(node_key) => {
if let Some(ty) = type_store.get_cached_node_type(file_id, node_key.erased()) {
ty
} else {
let parsed = db.parse(file_id)?;
let ast = parsed.ast();
let node = node_key
.resolve(ast.as_any_node_ref())
.expect("node key should resolve");
let ty = type_store.add_function(file_id, &node.name.id).into();
type_store.cache_node_type(file_id, *node_key.erased(), ty);
ty
}
}
Definition::Assignment(node_key) => {
let parsed = db.parse(file_id)?;
let ast = parsed.ast();
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
// TODO handle unpacking assignment correctly
infer_expr_type(db, file_id, &node.value)?
}
_ => todo!("other kinds of definitions"),
};
type_store.cache_symbol_type(file_id, symbol_id, ty);
// TODO record dependencies
Ok(ty)
}
fn infer_expr_type<Db>(db: &Db, file_id: FileId, expr: &ast::Expr) -> QueryResult<Type>
where
Db: SemanticDb + HasJar<SemanticJar>,
{
// TODO cache the resolution of the type on the node
let symbols = db.symbol_table(file_id)?;
match expr {
ast::Expr::Name(name) => {
if let Some(symbol_id) = symbols.root_symbol_id_by_name(&name.id) {
db.infer_symbol_type(file_id, symbol_id)
} else {
Ok(Type::Unknown)
}
}
_ => todo!("full expression type resolution"),
}
}
#[cfg(test)]
mod tests {
use crate::db::tests::TestDb;
use crate::db::{HasJar, SemanticDb, SemanticJar};
use crate::module::{ModuleName, ModuleSearchPath, ModuleSearchPathKind};
use crate::types::Type;
// TODO with virtual filesystem we shouldn't have to write files to disk for these
// tests
struct TestCase {
temp_dir: tempfile::TempDir,
db: TestDb,
src: ModuleSearchPath,
}
fn create_test() -> std::io::Result<TestCase> {
let temp_dir = tempfile::tempdir()?;
let src = temp_dir.path().join("src");
std::fs::create_dir(&src)?;
let src = ModuleSearchPath::new(src.canonicalize()?, ModuleSearchPathKind::FirstParty);
let roots = vec![src.clone()];
let mut db = TestDb::default();
db.set_module_search_paths(roots);
Ok(TestCase { temp_dir, db, src })
}
#[test]
fn follow_import_to_class() -> anyhow::Result<()> {
let case = create_test()?;
let db = &case.db;
let a_path = case.src.path().join("a.py");
let b_path = case.src.path().join("b.py");
std::fs::write(a_path, "from b import C as D; E = D")?;
std::fs::write(b_path, "class C: pass")?;
let a_file = db
.resolve_module(ModuleName::new("a"))?
.expect("module should be found")
.path(db)?
.file();
let a_syms = db.symbol_table(a_file)?;
let e_sym = a_syms
.root_symbol_id_by_name("E")
.expect("E symbol should be found");
let ty = db.infer_symbol_type(a_file, e_sym)?;
let jar = HasJar::<SemanticJar>::jar(db)?;
assert!(matches!(ty, Type::Class(_)));
assert_eq!(format!("{}", ty.display(&jar.type_store)), "Literal[C]");
Ok(())
}
#[test]
fn resolve_base_class_by_name() -> anyhow::Result<()> {
let case = create_test()?;
let db = &case.db;
let path = case.src.path().join("mod.py");
std::fs::write(path, "class Base: pass\nclass Sub(Base): pass")?;
let file = db
.resolve_module(ModuleName::new("mod"))?
.expect("module should be found")
.path(db)?
.file();
let syms = db.symbol_table(file)?;
let sym = syms
.root_symbol_id_by_name("Sub")
.expect("Sub symbol should be found");
let ty = db.infer_symbol_type(file, sym)?;
let Type::Class(class_id) = ty else {
panic!("Sub is not a Class")
};
let jar = HasJar::<SemanticJar>::jar(db)?;
let base_names: Vec<_> = jar
.type_store
.get_class(class_id)
.bases()
.iter()
.map(|base_ty| format!("{}", base_ty.display(&jar.type_store)))
.collect();
assert_eq!(base_names, vec!["Literal[Base]"]);
Ok(())
}
}

View File

@@ -0,0 +1,78 @@
use anyhow::Context;
use std::path::Path;
use crate::files::Files;
use crate::program::{FileChange, FileChangeKind};
use notify::event::{CreateKind, RemoveKind};
use notify::{recommended_watcher, Event, EventKind, RecommendedWatcher, RecursiveMode, Watcher};
pub struct FileWatcher {
watcher: RecommendedWatcher,
}
pub trait EventHandler: Send + 'static {
fn handle(&self, changes: Vec<FileChange>);
}
impl<F> EventHandler for F
where
F: Fn(Vec<FileChange>) + Send + 'static,
{
fn handle(&self, changes: Vec<FileChange>) {
let f = self;
f(changes);
}
}
impl FileWatcher {
pub fn new<E>(handler: E, files: Files) -> anyhow::Result<Self>
where
E: EventHandler,
{
Self::from_handler(Box::new(handler), files)
}
fn from_handler(handler: Box<dyn EventHandler>, files: Files) -> anyhow::Result<Self> {
let watcher = recommended_watcher(move |changes: notify::Result<Event>| {
match changes {
Ok(event) => {
// TODO verify that this handles all events correctly
let change_kind = match event.kind {
EventKind::Create(CreateKind::File) => FileChangeKind::Created,
EventKind::Modify(_) => FileChangeKind::Modified,
EventKind::Remove(RemoveKind::File) => FileChangeKind::Deleted,
_ => {
return;
}
};
let mut changes = Vec::new();
for path in event.paths {
if path.is_file() {
let id = files.intern(&path);
changes.push(FileChange::new(id, change_kind));
}
}
if !changes.is_empty() {
handler.handle(changes);
}
}
// TODO proper error handling
Err(err) => {
panic!("Error: {err}");
}
}
})
.context("Failed to create file watcher.")?;
Ok(Self { watcher })
}
pub fn watch_folder(&mut self, path: &Path) -> anyhow::Result<()> {
self.watcher.watch(path, RecursiveMode::Recursive)?;
Ok(())
}
}

View File

@@ -1284,3 +1284,49 @@ fn negated_per_file_ignores_overlap() -> Result<()> {
"###);
Ok(())
}
#[test]
fn unused_interaction() -> Result<()> {
let tempdir = TempDir::new()?;
let ruff_toml = tempdir.path().join("ruff.toml");
fs::write(
&ruff_toml,
r#"
[lint]
select = ["F"]
"#,
)?;
insta::with_settings!({
filters => vec![(tempdir_filter(&tempdir).as_str(), "[TMP]/")]
}, {
assert_cmd_snapshot!(Command::new(get_cargo_bin(BIN_NAME))
.args(STDIN_BASE_OPTIONS)
.arg("--config")
.arg(&ruff_toml)
.args(["--stdin-filename", "test.py"])
.arg("--fix")
.arg("-")
.pass_stdin(r#"
import os # F401
def function():
import os # F811
print(os.name)
"#), @r###"
success: true
exit_code: 0
----- stdout -----
import os # F401
def function():
print(os.name)
----- stderr -----
Found 1 error (1 fixed, 0 remaining).
"###);
});
Ok(())
}

View File

@@ -11,9 +11,9 @@ repository = { workspace = true }
license = { workspace = true }
[dependencies]
itertools = { workspace = true }
glob = { workspace = true }
globset = { workspace = true }
itertools = { workspace = true }
regex = { workspace = true }
filetime = { workspace = true }
seahash = { workspace = true }

View File

@@ -115,25 +115,25 @@ class non_keyword_abcmeta_2(abc.ABCMeta): # safe
# very invalid code, but that's up to mypy et al to check
class keyword_abc_1(metaclass=ABC): # safe
class keyword_abc_1(metaclass=ABC): # incorrect but outside scope of this check
def method(self):
foo()
class keyword_abc_2(metaclass=abc.ABC): # safe
class keyword_abc_2(metaclass=abc.ABC): # incorrect but outside scope of this check
def method(self):
foo()
class abc_set_class_variable_1(ABC): # safe
class abc_set_class_variable_1(ABC): # safe (abstract attribute)
foo: int
class abc_set_class_variable_2(ABC): # safe
class abc_set_class_variable_2(ABC): # error (not an abstract attribute)
foo = 2
class abc_set_class_variable_3(ABC): # safe
class abc_set_class_variable_3(ABC): # error (not an abstract attribute)
foo: int = 2

View File

@@ -7,6 +7,9 @@ logging.log(logging.INFO, f"Hello {name}")
_LOGGER = logging.getLogger()
_LOGGER.info(f"{__name__}")
logging.getLogger().info(f"{name}")
from logging import info
info(f"{name}")
info(f"{__name__}")

View File

@@ -53,7 +53,7 @@ class WithinBody[T](list[T]): # OK
def foo(self, x: T) -> T: # OK
return x
def foo(self):
T # OK
@@ -76,6 +76,16 @@ type Foo[T: (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNot
def foo[T: (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
class Foo[T: (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
# Same in defaults
type Foo[T = DoesNotExist] = T # F821: Undefined name `DoesNotExist`
def foo[T = DoesNotExist](t: T) -> T: return t # F821: Undefined name `DoesNotExist`
class Foo[T = DoesNotExist](list[T]): ... # F821: Undefined name `DoesNotExist`
type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
def foo[T = (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
class Foo[T = (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
# Type parameters in nested classes
class Parent[T]:
@@ -83,21 +93,21 @@ class Parent[T]:
def can_use_class_variable(self, x: t) -> t: # OK
return x
class Child:
def can_access_parent_type_parameter(self, x: T) -> T: # OK
T # OK
return x
def cannot_access_parent_variable(self, x: t) -> t: # F821: Undefined name `T`
t # F821: Undefined name `t`
return x
# Type parameters in nested functions
def can_access_inside_nested[T](t: T) -> T: # OK
def bar(x: T) -> T: # OK
T # OK
return x
bar(t)

View File

@@ -0,0 +1 @@
#noqa

View File

@@ -28,3 +28,13 @@ class MyClassBase(metaclass=ABCMeta):
@abstractmethod
def example(self, value):
"""Setter."""
class VariadicParameters:
@property
def attribute_var_args(self, *args): # [property-with-parameters]
return sum(args)
@property
def attribute_var_kwargs(self, **kwargs): #[property-with-parameters]
return {key: value * 2 for key, value in kwargs.items()}

View File

@@ -0,0 +1,19 @@
num = 1337
def return_num() -> int:
return num
print(oct(num)[2:]) # FURB116
print(hex(num)[2:]) # FURB116
print(bin(num)[2:]) # FURB116
print(oct(1337)[2:]) # FURB116
print(hex(1337)[2:]) # FURB116
print(bin(1337)[2:]) # FURB116
print(bin(return_num())[2:]) # FURB116 (no autofix)
print(bin(int(f"{num}"))[2:]) # FURB116 (no autofix)
## invalid
print(oct(0o1337)[1:])
print(hex(0x1337)[3:])

View File

@@ -73,3 +73,8 @@ def foo():
async def test():
return [check async for check in async_func()]
async def test() -> str:
vals = [str(val) for val in await async_func(1)]
return ",".join(vals)

View File

@@ -26,10 +26,10 @@ def f() -> None:
# fmt: off
# Invalid - no space before #
d = 1# noqa: E501
d = 1 # noqa: E501
# Invalid - many spaces before #
d = 1 # noqa: E501
d = 1 # noqa: E501
# fmt: on
@@ -104,5 +104,28 @@ def f():
def f():
# Invalid - nonexistant error code with multibyte character
d = 1 #…noqa: F841, E50
e = 1 #…noqa: E50
d = 1 # …noqa: F841, E50
e = 1 # …noqa: E50
def f():
# Disabled - check redirects are reported correctly
eval(command) # noqa: PGH001
# Check duplicate code detection
def f():
x = 2 # noqa: F841, F841, X200
y = 2 == bar # noqa: SIM300, F841, SIM300, SIM300
z = 2 # noqa: F841 F841 F841, F841, F841
return
# Allow code redirects
x = eval(command) # noqa: PGH001, S307
x = eval(command) # noqa: S307, PGH001, S307, S307, S307
x = eval(command) # noqa: PGH001, S307, PGH001
x = eval(command) # noqa: PGH001, S307, PGH001, S307

View File

@@ -0,0 +1,6 @@
x = 2 # noqa: RUF940
x = 2 # noqa: RUF950
x = 2 # noqa: RUF940, RUF950
x = 2 # noqa: RUF950, RUF940, RUF950, RUF950, RUF950
x = 2 # noqa: RUF940, RUF950, RUF940
x = 2 # noqa: RUF940, RUF950, RUF940, RUF950

View File

@@ -128,6 +128,9 @@ pub(crate) fn expression(expr: &Expr, checker: &mut Checker) {
if checker.enabled(Rule::SortedMinMax) {
refurb::rules::sorted_min_max(checker, subscript);
}
if checker.enabled(Rule::FStringNumberFormat) {
refurb::rules::fstring_number_format(checker, subscript);
}
pandas_vet::rules::subscript(checker, value, expr);
}

View File

@@ -31,8 +31,8 @@ use std::path::Path;
use itertools::Itertools;
use log::debug;
use ruff_python_ast::{
self as ast, Comprehension, ElifElseClause, ExceptHandler, Expr, ExprContext, FStringElement,
Keyword, MatchCase, Parameter, ParameterWithDefault, Parameters, Pattern, Stmt, Suite, UnaryOp,
self as ast, AnyParameterRef, Comprehension, ElifElseClause, ExceptHandler, Expr, ExprContext,
FStringElement, Keyword, MatchCase, Parameter, Parameters, Pattern, Stmt, Suite, UnaryOp,
};
use ruff_text_size::{Ranged, TextRange, TextSize};
@@ -464,7 +464,7 @@ impl<'a> Visitor<'a> for Checker<'a> {
let level = *level;
// Mark the top-level module as "seen" by the semantic model.
if level.map_or(true, |level| level == 0) {
if level == 0 {
if let Some(module) = module.and_then(|module| module.split('.').next()) {
self.semantic.add_module(module);
}
@@ -604,15 +604,11 @@ impl<'a> Visitor<'a> for Checker<'a> {
self.visit_type_params(type_params);
}
for parameter_with_default in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
{
if let Some(expr) = &parameter_with_default.parameter.annotation {
if singledispatch {
for parameter in &**parameters {
if let Some(expr) = parameter.annotation() {
if singledispatch && !parameter.is_variadic() {
self.visit_runtime_required_annotation(expr);
singledispatch = false;
} else {
match annotation {
AnnotationContext::RuntimeRequired => {
@@ -625,42 +621,11 @@ impl<'a> Visitor<'a> for Checker<'a> {
self.visit_annotation(expr);
}
}
};
}
}
if let Some(expr) = &parameter_with_default.default {
if let Some(expr) = parameter.default() {
self.visit_expr(expr);
}
singledispatch = false;
}
if let Some(arg) = &parameters.vararg {
if let Some(expr) = &arg.annotation {
match annotation {
AnnotationContext::RuntimeRequired => {
self.visit_runtime_required_annotation(expr);
}
AnnotationContext::RuntimeEvaluated => {
self.visit_runtime_evaluated_annotation(expr);
}
AnnotationContext::TypingOnly => {
self.visit_annotation(expr);
}
}
}
}
if let Some(arg) = &parameters.kwarg {
if let Some(expr) = &arg.annotation {
match annotation {
AnnotationContext::RuntimeRequired => {
self.visit_runtime_required_annotation(expr);
}
AnnotationContext::RuntimeEvaluated => {
self.visit_runtime_evaluated_annotation(expr);
}
AnnotationContext::TypingOnly => {
self.visit_annotation(expr);
}
}
}
}
for expr in returns {
match annotation {
@@ -1043,19 +1008,11 @@ impl<'a> Visitor<'a> for Checker<'a> {
) => {
// Visit the default arguments, but avoid the body, which will be deferred.
if let Some(parameters) = parameters {
for ParameterWithDefault {
default,
parameter: _,
range: _,
} in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
for default in parameters
.iter_non_variadic_params()
.filter_map(|param| param.default.as_deref())
{
if let Some(expr) = &default {
self.visit_expr(expr);
}
self.visit_expr(default);
}
}
@@ -1483,20 +1440,8 @@ impl<'a> Visitor<'a> for Checker<'a> {
// Step 1: Binding.
// Bind, but intentionally avoid walking default expressions, as we handle them
// upstream.
for parameter_with_default in &parameters.posonlyargs {
self.visit_parameter(&parameter_with_default.parameter);
}
for parameter_with_default in &parameters.args {
self.visit_parameter(&parameter_with_default.parameter);
}
if let Some(arg) = &parameters.vararg {
self.visit_parameter(arg);
}
for parameter_with_default in &parameters.kwonlyargs {
self.visit_parameter(&parameter_with_default.parameter);
}
if let Some(arg) = &parameters.kwarg {
self.visit_parameter(arg);
for parameter in parameters.iter().map(AnyParameterRef::as_parameter) {
self.visit_parameter(parameter);
}
// Step 4: Analysis
@@ -1568,8 +1513,8 @@ impl<'a> Visitor<'a> for Checker<'a> {
// Step 1: Binding
match type_param {
ast::TypeParam::TypeVar(ast::TypeParamTypeVar { name, range, .. })
| ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple { name, range })
| ast::TypeParam::ParamSpec(ast::TypeParamParamSpec { name, range }) => {
| ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple { name, range, .. })
| ast::TypeParam::ParamSpec(ast::TypeParamParamSpec { name, range, .. }) => {
self.add_binding(
name.as_str(),
*range,
@@ -1579,13 +1524,46 @@ impl<'a> Visitor<'a> for Checker<'a> {
}
}
// Step 2: Traversal
if let ast::TypeParam::TypeVar(ast::TypeParamTypeVar {
bound: Some(bound), ..
}) = type_param
{
self.visit
.type_param_definitions
.push((bound, self.semantic.snapshot()));
match type_param {
ast::TypeParam::TypeVar(ast::TypeParamTypeVar {
bound,
default,
name: _,
range: _,
}) => {
if let Some(expr) = bound {
self.visit
.type_param_definitions
.push((expr, self.semantic.snapshot()));
}
if let Some(expr) = default {
self.visit
.type_param_definitions
.push((expr, self.semantic.snapshot()));
}
}
ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple {
default,
name: _,
range: _,
}) => {
if let Some(expr) = default {
self.visit
.type_param_definitions
.push((expr, self.semantic.snapshot()));
}
}
ast::TypeParam::ParamSpec(ast::TypeParamParamSpec {
default,
name: _,
range: _,
}) => {
if let Some(expr) = default {
self.visit
.type_param_definitions
.push((expr, self.semantic.snapshot()));
}
}
}
}

View File

@@ -39,7 +39,7 @@ fn extract_import_map(path: &Path, package: Option<&Path>, blocks: &[&Block]) ->
level,
range: _,
}) => {
let level = level.unwrap_or_default() as usize;
let level = *level as usize;
let module = if let Some(module) = module {
let module: &String = module.as_ref();
if level == 0 {

View File

@@ -3,18 +3,19 @@
use std::path::Path;
use itertools::Itertools;
use ruff_text_size::Ranged;
use rustc_hash::FxHashSet;
use ruff_diagnostics::{Diagnostic, Edit, Fix};
use ruff_python_trivia::CommentRanges;
use ruff_source_file::Locator;
use ruff_text_size::Ranged;
use crate::fix::edits::delete_comment;
use crate::noqa;
use crate::noqa::{Directive, FileExemption, NoqaDirectives, NoqaMapping};
use crate::noqa::{Code, Directive, FileExemption, NoqaDirectives, NoqaMapping};
use crate::registry::{AsRule, Rule, RuleSet};
use crate::rule_redirects::get_redirect_target;
use crate::rules::pygrep_hooks;
use crate::rules::ruff;
use crate::rules::ruff::rules::{UnusedCodes, UnusedNOQA};
use crate::settings::LinterSettings;
@@ -85,7 +86,7 @@ pub(crate) fn check_noqa(
true
}
Directive::Codes(directive) => {
if noqa::includes(diagnostic.kind.rule(), directive.codes()) {
if directive.includes(diagnostic.kind.rule()) {
directive_line
.matches
.push(diagnostic.kind.rule().noqa_code());
@@ -128,33 +129,37 @@ pub(crate) fn check_noqa(
}
Directive::Codes(directive) => {
let mut disabled_codes = vec![];
let mut duplicated_codes = vec![];
let mut unknown_codes = vec![];
let mut unmatched_codes = vec![];
let mut valid_codes = vec![];
let mut seen_codes = FxHashSet::default();
let mut self_ignore = false;
for code in directive.codes() {
let code = get_redirect_target(code).unwrap_or(code);
for original_code in directive.iter().map(Code::as_str) {
let code = get_redirect_target(original_code).unwrap_or(original_code);
if Rule::UnusedNOQA.noqa_code() == code {
self_ignore = true;
break;
}
if line.matches.iter().any(|match_| *match_ == code)
if !seen_codes.insert(original_code) {
duplicated_codes.push(original_code);
} else if line.matches.iter().any(|match_| *match_ == code)
|| settings
.external
.iter()
.any(|external| code.starts_with(external))
{
valid_codes.push(code);
valid_codes.push(original_code);
} else {
if let Ok(rule) = Rule::from_code(code) {
if settings.rules.enabled(rule) {
unmatched_codes.push(code);
unmatched_codes.push(original_code);
} else {
disabled_codes.push(code);
disabled_codes.push(original_code);
}
} else {
unknown_codes.push(code);
unknown_codes.push(original_code);
}
}
}
@@ -164,6 +169,7 @@ pub(crate) fn check_noqa(
}
if !(disabled_codes.is_empty()
&& duplicated_codes.is_empty()
&& unknown_codes.is_empty()
&& unmatched_codes.is_empty())
{
@@ -174,6 +180,10 @@ pub(crate) fn check_noqa(
.iter()
.map(|code| (*code).to_string())
.collect(),
duplicated: duplicated_codes
.iter()
.map(|code| (*code).to_string())
.collect(),
unknown: unknown_codes
.iter()
.map(|code| (*code).to_string())
@@ -208,6 +218,10 @@ pub(crate) fn check_noqa(
pygrep_hooks::rules::blanket_noqa(diagnostics, &noqa_directives, locator);
}
if settings.rules.enabled(Rule::RedirectedNOQA) {
ruff::rules::redirected_noqa(diagnostics, &noqa_directives);
}
ignored_diagnostics.sort_unstable();
ignored_diagnostics
}

View File

@@ -969,33 +969,34 @@ pub fn code_to_rule(linter: Linter, code: &str) -> Option<(RuleGroup, Rule)> {
(Ruff, "028") => (RuleGroup::Preview, rules::ruff::rules::InvalidFormatterSuppressionComment),
(Ruff, "029") => (RuleGroup::Preview, rules::ruff::rules::UnusedAsync),
(Ruff, "100") => (RuleGroup::Stable, rules::ruff::rules::UnusedNOQA),
(Ruff, "101") => (RuleGroup::Preview, rules::ruff::rules::RedirectedNOQA),
(Ruff, "200") => (RuleGroup::Stable, rules::ruff::rules::InvalidPyprojectToml),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "900") => (RuleGroup::Stable, rules::ruff::rules::StableTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "901") => (RuleGroup::Stable, rules::ruff::rules::StableTestRuleSafeFix),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "902") => (RuleGroup::Stable, rules::ruff::rules::StableTestRuleUnsafeFix),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "903") => (RuleGroup::Stable, rules::ruff::rules::StableTestRuleDisplayOnlyFix),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "911") => (RuleGroup::Preview, rules::ruff::rules::PreviewTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
#[allow(deprecated)]
(Ruff, "912") => (RuleGroup::Nursery, rules::ruff::rules::NurseryTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "920") => (RuleGroup::Deprecated, rules::ruff::rules::DeprecatedTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "921") => (RuleGroup::Deprecated, rules::ruff::rules::AnotherDeprecatedTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "930") => (RuleGroup::Removed, rules::ruff::rules::RemovedTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "931") => (RuleGroup::Removed, rules::ruff::rules::AnotherRemovedTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "940") => (RuleGroup::Removed, rules::ruff::rules::RedirectedFromTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "950") => (RuleGroup::Stable, rules::ruff::rules::RedirectedToTestRule),
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
(Ruff, "960") => (RuleGroup::Removed, rules::ruff::rules::RedirectedFromPrefixTestRule),
@@ -1050,6 +1051,7 @@ pub fn code_to_rule(linter: Linter, code: &str) -> Option<(RuleGroup, Rule)> {
(Refurb, "110") => (RuleGroup::Preview, rules::refurb::rules::IfExpInsteadOfOrOperator),
#[allow(deprecated)]
(Refurb, "113") => (RuleGroup::Nursery, rules::refurb::rules::RepeatedAppend),
(Refurb, "116") => (RuleGroup::Preview, rules::refurb::rules::FStringNumberFormat),
(Refurb, "118") => (RuleGroup::Preview, rules::refurb::rules::ReimplementedOperator),
(Refurb, "129") => (RuleGroup::Preview, rules::refurb::rules::ReadlinesInFor),
#[allow(deprecated)]

View File

@@ -129,21 +129,31 @@ fn apply_fixes<'a>(
/// Compare two fixes.
fn cmp_fix(rule1: Rule, rule2: Rule, fix1: &Fix, fix2: &Fix) -> std::cmp::Ordering {
fix1.min_start()
.cmp(&fix2.min_start())
.then_with(|| match (&rule1, &rule2) {
// Apply `EndsInPeriod` fixes before `NewLineAfterLastParagraph` fixes.
(Rule::EndsInPeriod, Rule::NewLineAfterLastParagraph) => std::cmp::Ordering::Less,
(Rule::NewLineAfterLastParagraph, Rule::EndsInPeriod) => std::cmp::Ordering::Greater,
// Apply `IfElseBlockInsteadOfDictGet` fixes before `IfElseBlockInsteadOfIfExp` fixes.
(Rule::IfElseBlockInsteadOfDictGet, Rule::IfElseBlockInsteadOfIfExp) => {
std::cmp::Ordering::Less
}
(Rule::IfElseBlockInsteadOfIfExp, Rule::IfElseBlockInsteadOfDictGet) => {
std::cmp::Ordering::Greater
}
// Always apply `RedefinedWhileUnused` before `UnusedImport`, as the latter can end up fixing
// the former.
{
match (rule1, rule2) {
(Rule::RedefinedWhileUnused, Rule::UnusedImport) => return std::cmp::Ordering::Less,
(Rule::UnusedImport, Rule::RedefinedWhileUnused) => return std::cmp::Ordering::Greater,
_ => std::cmp::Ordering::Equal,
})
}
}
// Apply fixes in order of their start position.
.then_with(|| fix1.min_start().cmp(&fix2.min_start()))
// Break ties in the event of overlapping rules, for some specific combinations.
.then_with(|| match (&rule1, &rule2) {
// Apply `EndsInPeriod` fixes before `NewLineAfterLastParagraph` fixes.
(Rule::EndsInPeriod, Rule::NewLineAfterLastParagraph) => std::cmp::Ordering::Less,
(Rule::NewLineAfterLastParagraph, Rule::EndsInPeriod) => std::cmp::Ordering::Greater,
// Apply `IfElseBlockInsteadOfDictGet` fixes before `IfElseBlockInsteadOfIfExp` fixes.
(Rule::IfElseBlockInsteadOfDictGet, Rule::IfElseBlockInsteadOfIfExp) => {
std::cmp::Ordering::Less
}
(Rule::IfElseBlockInsteadOfIfExp, Rule::IfElseBlockInsteadOfDictGet) => {
std::cmp::Ordering::Greater
}
_ => std::cmp::Ordering::Equal,
})
}
#[cfg(test)]

View File

@@ -405,7 +405,7 @@ impl<'a> Importer<'a> {
range: _,
}) = stmt
{
if level.map_or(true, |level| level == 0)
if *level == 0
&& name.as_ref().is_some_and(|name| name == module)
&& names.iter().all(|alias| alias.name.as_str() != "*")
{

View File

@@ -33,7 +33,7 @@ use crate::message::Message;
use crate::noqa::add_noqa;
use crate::registry::{AsRule, Rule, RuleSet};
use crate::rules::pycodestyle;
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
use crate::rules::ruff::rules::test_rules::{self, TestRule, TEST_RULES};
use crate::settings::types::UnsafeFixes;
use crate::settings::{flags, LinterSettings};
@@ -218,7 +218,7 @@ pub fn check_path(
}
// Raise violations for internal test rules
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
{
for test_rule in TEST_RULES {
if !settings.rules.enabled(*test_rule) {

View File

@@ -85,8 +85,16 @@ impl<'a> Directive<'a> {
let mut codes_end = codes_start;
let mut leading_space = 0;
while let Some(code) = Self::lex_code(&text[codes_end + leading_space..]) {
codes.push(code);
codes_end += leading_space;
codes.push(Code {
code,
range: TextRange::at(
TextSize::try_from(codes_end).unwrap(),
code.text_len(),
)
.add(offset),
});
codes_end += code.len();
// Codes can be comma- or whitespace-delimited. Compute the length of the
@@ -175,16 +183,51 @@ impl Ranged for All {
}
}
/// An individual rule code in a `noqa` directive (e.g., `F401`).
#[derive(Debug)]
pub(crate) struct Code<'a> {
code: &'a str,
range: TextRange,
}
impl<'a> Code<'a> {
/// The code that is ignored by the `noqa` directive.
pub(crate) fn as_str(&self) -> &'a str {
self.code
}
}
impl Display for Code<'_> {
fn fmt(&self, fmt: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
fmt.write_str(self.code)
}
}
impl<'a> Ranged for Code<'a> {
/// The range of the rule code.
fn range(&self) -> TextRange {
self.range
}
}
#[derive(Debug)]
pub(crate) struct Codes<'a> {
range: TextRange,
codes: Vec<&'a str>,
codes: Vec<Code<'a>>,
}
impl Codes<'_> {
/// The codes that are ignored by the `noqa` directive.
pub(crate) fn codes(&self) -> &[&str] {
&self.codes
impl<'a> Codes<'a> {
/// Returns an iterator over the [`Code`]s in the `noqa` directive.
pub(crate) fn iter(&self) -> std::slice::Iter<Code> {
self.codes.iter()
}
/// Returns `true` if the string list of `codes` includes `code` (or an alias
/// thereof).
pub(crate) fn includes(&self, needle: Rule) -> bool {
let needle = needle.noqa_code();
self.iter()
.any(|code| needle == get_redirect_target(code.as_str()).unwrap_or(code.as_str()))
}
}
@@ -195,15 +238,6 @@ impl Ranged for Codes<'_> {
}
}
/// Returns `true` if the string list of `codes` includes `code` (or an alias
/// thereof).
pub(crate) fn includes(needle: Rule, haystack: &[&str]) -> bool {
let needle = needle.noqa_code();
haystack
.iter()
.any(|candidate| needle == get_redirect_target(candidate).unwrap_or(candidate))
}
/// Returns `true` if the given [`Rule`] is ignored at the specified `lineno`.
pub(crate) fn rule_is_ignored(
code: Rule,
@@ -215,7 +249,7 @@ pub(crate) fn rule_is_ignored(
let line_range = locator.line_range(offset);
match Directive::try_extract(locator.slice(line_range), line_range.start()) {
Ok(Some(Directive::All(_))) => true,
Ok(Some(Directive::Codes(Codes { codes, range: _ }))) => includes(code, &codes),
Ok(Some(Directive::Codes(codes))) => codes.includes(code),
_ => false,
}
}
@@ -525,8 +559,8 @@ fn add_noqa_inner(
Directive::All(_) => {
continue;
}
Directive::Codes(Codes { codes, range: _ }) => {
if includes(diagnostic.kind.rule(), codes) {
Directive::Codes(codes) => {
if codes.includes(diagnostic.kind.rule()) {
continue;
}
}
@@ -542,9 +576,9 @@ fn add_noqa_inner(
Directive::All(_) => {
continue;
}
Directive::Codes(Codes { codes, range: _ }) => {
Directive::Codes(codes) => {
let rule = diagnostic.kind.rule();
if !includes(rule, codes) {
if !codes.includes(rule) {
matches_by_line
.entry(directive_line.start())
.or_insert_with(|| {
@@ -591,7 +625,7 @@ fn add_noqa_inner(
Some(Directive::All(_)) => {
// Does not get inserted into the map.
}
Some(Directive::Codes(Codes { range, codes })) => {
Some(Directive::Codes(codes)) => {
// Reconstruct the line based on the preserved rule codes.
// This enables us to tally the number of edits.
let output_start = output.len();
@@ -599,7 +633,7 @@ fn add_noqa_inner(
// Add existing content.
output.push_str(
locator
.slice(TextRange::new(offset, range.start()))
.slice(TextRange::new(offset, codes.start()))
.trim_end(),
);
@@ -652,6 +686,8 @@ pub(crate) struct NoqaDirectiveLine<'a> {
pub(crate) directive: Directive<'a>,
/// The codes that are ignored by the directive.
pub(crate) matches: Vec<NoqaCode>,
// Whether the directive applies to range.end
pub(crate) includes_end: bool,
}
impl Ranged for NoqaDirectiveLine<'_> {
@@ -684,23 +720,18 @@ impl<'a> NoqaDirectives<'a> {
}
Ok(Some(directive)) => {
// noqa comments are guaranteed to be single line.
let range = locator.line_range(range.start());
directives.push(NoqaDirectiveLine {
range: locator.line_range(range.start()),
range,
directive,
matches: Vec::new(),
includes_end: range.end() == locator.contents().text_len(),
});
}
Ok(None) => {}
}
}
// Extend a mapping at the end of the file to also include the EOF token.
if let Some(last) = directives.last_mut() {
if last.range.end() == locator.contents().text_len() {
last.range = last.range.add_end(TextSize::from(1));
}
}
Self { inner: directives }
}
@@ -724,11 +755,15 @@ impl<'a> NoqaDirectives<'a> {
.binary_search_by(|directive| {
if directive.range.end() < offset {
std::cmp::Ordering::Less
} else if directive.range.contains(offset) {
std::cmp::Ordering::Equal
} else {
} else if directive.range.start() > offset {
std::cmp::Ordering::Greater
}
// At this point, end >= offset, start <= offset
else if !directive.includes_end && directive.range.end() == offset {
std::cmp::Ordering::Less
} else {
std::cmp::Ordering::Equal
}
})
.ok()
}

View File

@@ -246,7 +246,7 @@ impl Rule {
pub const fn lint_source(&self) -> LintSource {
match self {
Rule::InvalidPyprojectToml => LintSource::PyprojectToml,
Rule::BlanketNOQA | Rule::UnusedNOQA => LintSource::Noqa,
Rule::BlanketNOQA | Rule::RedirectedNOQA | Rule::UnusedNOQA => LintSource::Noqa,
Rule::BidirectionalUnicode
| Rule::BlankLineWithWhitespace
| Rule::DocLineTooLong

View File

@@ -104,10 +104,10 @@ static REDIRECTS: Lazy<HashMap<&'static str, &'static str>> = Lazy::new(|| {
("PGH001", "S307"),
("PGH002", "G010"),
// Test redirect by exact code
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
("RUF940", "RUF950"),
// Test redirect by prefix
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
("RUF96", "RUF95"),
// See: https://github.com/astral-sh/ruff/issues/10791
("PLW0117", "PLW0177"),

View File

@@ -323,7 +323,7 @@ mod schema {
})
.filter(|_rule| {
// Filter out all test-only rules
#[cfg(feature = "test-rules")]
#[cfg(any(feature = "test-rules", test))]
#[allow(clippy::used_underscore_binding)]
if _rule.starts_with("RUF9") {
return false;

View File

@@ -582,6 +582,7 @@ fn is_stub_function(function_def: &ast::StmtFunctionDef, checker: &Checker) -> b
}
/// Generate flake8-annotation checks for a given `Definition`.
/// ANN001, ANN401
pub(crate) fn definition(
checker: &Checker,
definition: &Definition,
@@ -615,23 +616,14 @@ pub(crate) fn definition(
let is_overridden = visibility::is_override(decorator_list, checker.semantic());
// ANN001, ANN401
// If this is a non-static method, skip `cls` or `self`.
for ParameterWithDefault {
parameter,
default: _,
range: _,
} in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.skip(
// If this is a non-static method, skip `cls` or `self`.
usize::from(
is_method && !visibility::is_staticmethod(decorator_list, checker.semantic()),
),
)
{
} in parameters.iter_non_variadic_params().skip(usize::from(
is_method && !visibility::is_staticmethod(decorator_list, checker.semantic()),
)) {
// ANN401 for dynamically typed parameters
if let Some(annotation) = &parameter.annotation {
has_any_typed_arg = true;

View File

@@ -74,11 +74,7 @@ pub(crate) fn hardcoded_password_default(checker: &mut Checker, parameters: &Par
parameter,
default,
range: _,
} in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
} in parameters.iter_non_variadic_params()
{
let Some(default) = default else {
continue;

View File

@@ -49,44 +49,35 @@ impl Violation for SslWithBadDefaults {
/// S503
pub(crate) fn ssl_with_bad_defaults(checker: &mut Checker, function_def: &StmtFunctionDef) {
function_def
for default in function_def
.parameters
.posonlyargs
.iter()
.chain(
function_def
.parameters
.args
.iter()
.chain(function_def.parameters.kwonlyargs.iter()),
)
.for_each(|param| {
if let Some(default) = &param.default {
match default.as_ref() {
Expr::Name(ast::ExprName { id, range, .. }) => {
if is_insecure_protocol(id.as_str()) {
checker.diagnostics.push(Diagnostic::new(
SslWithBadDefaults {
protocol: id.to_string(),
},
*range,
));
}
}
Expr::Attribute(ast::ExprAttribute { attr, range, .. }) => {
if is_insecure_protocol(attr.as_str()) {
checker.diagnostics.push(Diagnostic::new(
SslWithBadDefaults {
protocol: attr.to_string(),
},
*range,
));
}
}
_ => {}
.iter_non_variadic_params()
.filter_map(|param| param.default.as_deref())
{
match default {
Expr::Name(ast::ExprName { id, range, .. }) => {
if is_insecure_protocol(id.as_str()) {
checker.diagnostics.push(Diagnostic::new(
SslWithBadDefaults {
protocol: id.to_string(),
},
*range,
));
}
}
});
Expr::Attribute(ast::ExprAttribute { attr, range, .. }) => {
if is_insecure_protocol(attr.as_str()) {
checker.diagnostics.push(Diagnostic::new(
SslWithBadDefaults {
protocol: attr.to_string(),
},
*range,
));
}
}
_ => {}
}
}
}
/// Returns `true` if the given protocol name is insecure.

View File

@@ -56,6 +56,7 @@ impl Violation for AbstractBaseClassWithoutAbstractMethod {
format!("`{name}` is an abstract base class, but it has no abstract methods")
}
}
/// ## What it does
/// Checks for empty methods in abstract base classes without an abstract
/// decorator.
@@ -156,8 +157,13 @@ pub(crate) fn abstract_base_class(
let mut has_abstract_method = false;
for stmt in body {
// https://github.com/PyCQA/flake8-bugbear/issues/293
// Ignore abc's that declares a class attribute that must be set
if let Stmt::AnnAssign(_) | Stmt::Assign(_) = stmt {
// If an ABC declares an attribute by providing a type annotation
// but does not actually assign a value for that attribute,
// assume it is intended to be an "abstract attribute"
if matches!(
stmt,
Stmt::AnnAssign(ast::StmtAnnAssign { value: None, .. })
) {
has_abstract_method = true;
continue;
}

View File

@@ -139,11 +139,7 @@ pub(crate) fn function_call_in_argument_default(checker: &mut Checker, parameter
default,
parameter,
range: _,
} in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
} in parameters.iter_non_variadic_params()
{
if let Some(expr) = &default {
if !parameter.annotation.as_ref().is_some_and(|expr| {

View File

@@ -105,11 +105,7 @@ impl<'a> Visitor<'a> for NameFinder<'a> {
parameter,
default: _,
range: _,
} in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
} in parameters.iter_non_variadic_params()
{
self.names.remove(parameter.name.as_str());
}

View File

@@ -89,12 +89,7 @@ pub(crate) fn mutable_argument_default(checker: &mut Checker, function_def: &ast
parameter,
default,
range: _,
} in function_def
.parameters
.posonlyargs
.iter()
.chain(&function_def.parameters.args)
.chain(&function_def.parameters.kwonlyargs)
} in function_def.parameters.iter_non_variadic_params()
{
let Some(default) = default else {
continue;

View File

@@ -41,6 +41,20 @@ B024.py:92:7: B024 `notabc_Base_1` is an abstract base class, but it has no abst
94 | foo()
|
B024.py:132:7: B024 `abc_set_class_variable_2` is an abstract base class, but it has no abstract methods
|
132 | class abc_set_class_variable_2(ABC): # error (not an abstract attribute)
| ^^^^^^^^^^^^^^^^^^^^^^^^ B024
133 | foo = 2
|
B024.py:136:7: B024 `abc_set_class_variable_3` is an abstract base class, but it has no abstract methods
|
136 | class abc_set_class_variable_3(ABC): # error (not an abstract attribute)
| ^^^^^^^^^^^^^^^^^^^^^^^^ B024
137 | foo: int = 2
|
B024.py:141:7: B024 `abc_set_class_variable_4` is an abstract base class, but it has no abstract methods
|
140 | # this doesn't actually declare a class variable, it's just an expression
@@ -48,5 +62,3 @@ B024.py:141:7: B024 `abc_set_class_variable_4` is an abstract base class, but it
| ^^^^^^^^^^^^^^^^^^^^^^^^ B024
142 | foo
|

View File

@@ -107,10 +107,7 @@ pub(crate) fn unnecessary_map(
if parameters.as_ref().is_some_and(|parameters| {
late_binding(parameters, body)
|| parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.iter_non_variadic_params()
.any(|param| param.default.is_some())
|| parameters.vararg.is_some()
|| parameters.kwarg.is_some()
@@ -152,10 +149,7 @@ pub(crate) fn unnecessary_map(
if parameters.as_ref().is_some_and(|parameters| {
late_binding(parameters, body)
|| parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.iter_non_variadic_params()
.any(|param| param.default.is_some())
|| parameters.vararg.is_some()
|| parameters.kwarg.is_some()
@@ -207,10 +201,7 @@ pub(crate) fn unnecessary_map(
if parameters.as_ref().is_some_and(|parameters| {
late_binding(parameters, body)
|| parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.iter_non_variadic_params()
.any(|param| param.default.is_some())
|| parameters.vararg.is_some()
|| parameters.kwarg.is_some()

View File

@@ -25,23 +25,31 @@ G004.py:8:14: G004 Logging statement uses f-string
8 | _LOGGER.info(f"{__name__}")
| ^^^^^^^^^^^^^ G004
9 |
10 | from logging import info
10 | logging.getLogger().info(f"{name}")
|
G004.py:11:6: G004 Logging statement uses f-string
G004.py:10:26: G004 Logging statement uses f-string
|
10 | from logging import info
11 | info(f"{name}")
8 | _LOGGER.info(f"{__name__}")
9 |
10 | logging.getLogger().info(f"{name}")
| ^^^^^^^^^ G004
11 |
12 | from logging import info
|
G004.py:14:6: G004 Logging statement uses f-string
|
12 | from logging import info
13 |
14 | info(f"{name}")
| ^^^^^^^^^ G004
12 | info(f"{__name__}")
15 | info(f"{__name__}")
|
G004.py:12:6: G004 Logging statement uses f-string
G004.py:15:6: G004 Logging statement uses f-string
|
10 | from logging import info
11 | info(f"{name}")
12 | info(f"{__name__}")
14 | info(f"{name}")
15 | info(f"{__name__}")
| ^^^^^^^^^^^^^ G004
|

View File

@@ -2,7 +2,7 @@ use std::fmt;
use ruff_diagnostics::{Diagnostic, Violation};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{Expr, Parameters};
use ruff_python_ast::{AnyParameterRef, Parameters};
use ruff_text_size::Ranged;
use crate::checkers::ast::Checker;
@@ -58,43 +58,21 @@ pub(crate) fn no_return_argument_annotation(checker: &mut Checker, parameters: &
// Ex) def func(arg: NoReturn): ...
// Ex) def func(arg: NoReturn, /): ...
// Ex) def func(*, arg: NoReturn): ...
for annotation in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.filter_map(|arg| arg.parameter.annotation.as_ref())
{
check_no_return_argument_annotation(checker, annotation);
}
// Ex) def func(*args: NoReturn): ...
if let Some(arg) = &parameters.vararg {
if let Some(annotation) = &arg.annotation {
check_no_return_argument_annotation(checker, annotation);
}
}
// Ex) def func(**kwargs: NoReturn): ...
if let Some(arg) = &parameters.kwarg {
if let Some(annotation) = &arg.annotation {
check_no_return_argument_annotation(checker, annotation);
}
}
}
fn check_no_return_argument_annotation(checker: &mut Checker, annotation: &Expr) {
if checker.semantic().match_typing_expr(annotation, "NoReturn") {
checker.diagnostics.push(Diagnostic::new(
NoReturnArgumentAnnotationInStub {
module: if checker.settings.target_version >= Py311 {
TypingModule::Typing
} else {
TypingModule::TypingExtensions
for annotation in parameters.iter().filter_map(AnyParameterRef::annotation) {
if checker.semantic().match_typing_expr(annotation, "NoReturn") {
checker.diagnostics.push(Diagnostic::new(
NoReturnArgumentAnnotationInStub {
module: if checker.settings.target_version >= Py311 {
TypingModule::Typing
} else {
TypingModule::TypingExtensions
},
},
},
annotation.range(),
));
annotation.range(),
));
}
}
}

View File

@@ -1,6 +1,6 @@
use ruff_diagnostics::{Diagnostic, Violation};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{Expr, Parameters};
use ruff_python_ast::{AnyParameterRef, Expr, Parameters};
use ruff_python_semantic::analyze::typing::traverse_union;
use ruff_text_size::Ranged;
@@ -61,25 +61,7 @@ impl Violation for RedundantNumericUnion {
/// PYI041
pub(crate) fn redundant_numeric_union(checker: &mut Checker, parameters: &Parameters) {
for annotation in parameters
.args
.iter()
.chain(parameters.posonlyargs.iter())
.chain(parameters.kwonlyargs.iter())
.filter_map(|arg| arg.parameter.annotation.as_ref())
{
check_annotation(checker, annotation);
}
// If annotations on `args` or `kwargs` are flagged by this rule, the annotations themselves
// are not accurate, but check them anyway. It's possible that flagging them will help the user
// realize they're incorrect.
for annotation in parameters
.vararg
.iter()
.chain(parameters.kwarg.iter())
.filter_map(|arg| arg.annotation.as_ref())
{
for annotation in parameters.iter().filter_map(AnyParameterRef::annotation) {
check_annotation(checker, annotation);
}
}

View File

@@ -495,11 +495,7 @@ pub(crate) fn typed_argument_simple_defaults(checker: &mut Checker, parameters:
parameter,
default,
range: _,
} in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
} in parameters.iter_non_variadic_params()
{
let Some(default) = default else {
continue;
@@ -530,11 +526,7 @@ pub(crate) fn argument_simple_defaults(checker: &mut Checker, parameters: &Param
parameter,
default,
range: _,
} in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
} in parameters.iter_non_variadic_params()
{
let Some(default) = default else {
continue;

View File

@@ -8,7 +8,7 @@ use ruff_python_ast::name::UnqualifiedName;
use ruff_python_ast::visitor;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::Decorator;
use ruff_python_ast::{self as ast, Expr, ParameterWithDefault, Parameters, Stmt};
use ruff_python_ast::{self as ast, Expr, Parameters, Stmt};
use ruff_python_semantic::analyze::visibility::is_abstract;
use ruff_python_semantic::SemanticModel;
use ruff_text_size::Ranged;
@@ -841,28 +841,17 @@ fn check_fixture_returns(
/// PT019
fn check_test_function_args(checker: &mut Checker, parameters: &Parameters) {
parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.for_each(
|ParameterWithDefault {
parameter,
default: _,
range: _,
}| {
let name = &parameter.name;
if name.starts_with('_') {
checker.diagnostics.push(Diagnostic::new(
PytestFixtureParamWithoutValue {
name: name.to_string(),
},
parameter.range(),
));
}
},
);
for parameter in parameters.iter_non_variadic_params() {
let name = &parameter.parameter.name;
if name.starts_with('_') {
checker.diagnostics.push(Diagnostic::new(
PytestFixtureParamWithoutValue {
name: name.to_string(),
},
parameter.range(),
));
}
}
}
/// PT020

View File

@@ -54,13 +54,11 @@ pub(crate) fn import(import_from: &Stmt, name: &str, asname: Option<&str>) -> Op
pub(crate) fn import_from(
import_from: &Stmt,
module: Option<&str>,
level: Option<u32>,
level: u32,
) -> Option<Diagnostic> {
// If level is not zero or module is none, return
if let Some(level) = level {
if level != 0 {
return None;
}
if level != 0 {
return None;
};
if let Some(module) = module {

View File

@@ -76,7 +76,7 @@ impl Violation for RelativeImports {
fn fix_banned_relative_import(
stmt: &Stmt,
level: Option<u32>,
level: u32,
module: Option<&str>,
module_path: Option<&[String]>,
generator: Generator,
@@ -99,7 +99,7 @@ fn fix_banned_relative_import(
TextRange::default(),
)),
names: names.clone(),
level: Some(0),
level: 0,
range: TextRange::default(),
};
let content = generator.stmt(&node.into());
@@ -113,7 +113,7 @@ fn fix_banned_relative_import(
pub(crate) fn banned_relative_import(
checker: &Checker,
stmt: &Stmt,
level: Option<u32>,
level: u32,
module: Option<&str>,
module_path: Option<&[String]>,
strictness: Strictness,
@@ -122,7 +122,7 @@ pub(crate) fn banned_relative_import(
Strictness::All => 0,
Strictness::Parents => 1,
};
if level? > strictness_level {
if level > strictness_level {
let mut diagnostic = Diagnostic::new(RelativeImports { strictness }, stmt.range());
if let Some(fix) =
fix_banned_relative_import(stmt, level, module, module_path, checker.generator())

View File

@@ -300,7 +300,7 @@ pub(crate) fn typing_only_runtime_import(
// Categorize the import, using coarse-grained categorization.
let import_type = match categorize(
&qualified_name.to_string(),
None,
0,
&checker.settings.src,
checker.package(),
checker.settings.isort.detect_same_package,

View File

@@ -1,5 +1,3 @@
use std::iter;
use regex::Regex;
use ruff_python_ast as ast;
use ruff_python_ast::{Parameter, Parameters};
@@ -224,19 +222,20 @@ fn function(
diagnostics: &mut Vec<Diagnostic>,
) {
let args = parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.iter_non_variadic_params()
.map(|parameter_with_default| &parameter_with_default.parameter)
.chain(
iter::once::<Option<&Parameter>>(parameters.vararg.as_deref())
.flatten()
parameters
.vararg
.as_deref()
.into_iter()
.skip(usize::from(ignore_variadic_names)),
)
.chain(
iter::once::<Option<&Parameter>>(parameters.kwarg.as_deref())
.flatten()
parameters
.kwarg
.as_deref()
.into_iter()
.skip(usize::from(ignore_variadic_names)),
);
call(
@@ -260,20 +259,21 @@ fn method(
diagnostics: &mut Vec<Diagnostic>,
) {
let args = parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.iter_non_variadic_params()
.skip(1)
.map(|parameter_with_default| &parameter_with_default.parameter)
.chain(
iter::once::<Option<&Parameter>>(parameters.vararg.as_deref())
.flatten()
parameters
.vararg
.as_deref()
.into_iter()
.skip(usize::from(ignore_variadic_names)),
)
.chain(
iter::once::<Option<&Parameter>>(parameters.kwarg.as_deref())
.flatten()
parameters
.kwarg
.as_deref()
.into_iter()
.skip(usize::from(ignore_variadic_names)),
);
call(

View File

@@ -92,7 +92,7 @@ enum Reason<'a> {
#[allow(clippy::too_many_arguments)]
pub(crate) fn categorize<'a>(
module_name: &str,
level: Option<u32>,
level: u32,
src: &[PathBuf],
package: Option<&Path>,
detect_same_package: bool,
@@ -104,14 +104,14 @@ pub(crate) fn categorize<'a>(
) -> &'a ImportSection {
let module_base = module_name.split('.').next().unwrap();
let (mut import_type, mut reason) = {
if matches!(level, None | Some(0)) && module_base == "__future__" {
if level == 0 && module_base == "__future__" {
(&ImportSection::Known(ImportType::Future), Reason::Future)
} else if no_sections {
(
&ImportSection::Known(ImportType::FirstParty),
Reason::NoSections,
)
} else if level.is_some_and(|level| level > 0) {
} else if level > 0 {
(
&ImportSection::Known(ImportType::LocalFolder),
Reason::NonZeroLevel,
@@ -133,7 +133,7 @@ pub(crate) fn categorize<'a>(
&ImportSection::Known(ImportType::FirstParty),
Reason::SourceMatch(src),
)
} else if matches!(level, None | Some(0)) && module_name == "__main__" {
} else if level == 0 && module_name == "__main__" {
(
&ImportSection::Known(ImportType::FirstParty),
Reason::KnownFirstParty,
@@ -191,7 +191,7 @@ pub(crate) fn categorize_imports<'a>(
for (alias, comments) in block.import {
let import_type = categorize(
&alias.module_name(),
None,
0,
src,
package,
detect_same_package,

View File

@@ -52,7 +52,7 @@ pub(crate) enum AnnotatedImport<'a> {
ImportFrom {
module: Option<&'a str>,
names: Vec<AnnotatedAliasData<'a>>,
level: Option<u32>,
level: u32,
atop: Vec<Comment<'a>>,
inline: Vec<Comment<'a>>,
trailing_comma: TrailingComma,

View File

@@ -73,7 +73,7 @@ pub(crate) fn order_imports<'a>(
ModuleKey::from_module(
Some(alias.name),
alias.asname,
None,
0,
None,
ImportStyle::Straight,
settings,
@@ -90,7 +90,7 @@ pub(crate) fn order_imports<'a>(
Import((alias, _)) => ModuleKey::from_module(
Some(alias.name),
alias.asname,
None,
0,
None,
ImportStyle::Straight,
settings,
@@ -110,7 +110,7 @@ pub(crate) fn order_imports<'a>(
ModuleKey::from_module(
Some(alias.name),
alias.asname,
None,
0,
None,
ImportStyle::Straight,
settings,

View File

@@ -97,7 +97,7 @@ impl<'a> ModuleKey<'a> {
pub(crate) fn from_module(
name: Option<&'a str>,
asname: Option<&'a str>,
level: Option<u32>,
level: u32,
first_alias: Option<(&'a str, Option<&'a str>)>,
style: ImportStyle,
settings: &Settings,
@@ -106,13 +106,11 @@ impl<'a> ModuleKey<'a> {
let maybe_length = (settings.length_sort
|| (settings.length_sort_straight && style == ImportStyle::Straight))
.then_some(
name.map(str::width).unwrap_or_default() + level.unwrap_or_default() as usize,
);
.then_some(name.map(str::width).unwrap_or_default() + level as usize);
let distance = match level {
None | Some(0) => Distance::None,
Some(level) => match settings.relative_imports_order {
0 => Distance::None,
_ => match settings.relative_imports_order {
RelativeImportsOrder::ClosestToFurthest => Distance::Nearest(level),
RelativeImportsOrder::FurthestToClosest => Distance::Furthest(Reverse(level)),
},

View File

@@ -14,7 +14,7 @@ pub(crate) enum TrailingComma {
#[derive(Debug, Hash, Ord, PartialOrd, Eq, PartialEq, Clone)]
pub(crate) struct ImportFromData<'a> {
pub(crate) module: Option<&'a str>,
pub(crate) level: Option<u32>,
pub(crate) level: u32,
}
#[derive(Debug, Hash, Ord, PartialOrd, Eq, PartialEq)]

View File

@@ -257,15 +257,9 @@ fn rename_parameter(
) -> Result<Option<Fix>> {
// Don't fix if another parameter has the valid name.
if parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.skip(1)
.map(|parameter_with_default| &parameter_with_default.parameter)
.chain(parameters.vararg.as_deref())
.chain(parameters.kwarg.as_deref())
.any(|parameter| &parameter.name == function_type.valid_first_argument_name())
.any(|parameter| parameter.name() == function_type.valid_first_argument_name())
{
return Ok(None);
}

View File

@@ -196,13 +196,11 @@ fn function(
value: Some(Box::new(body.clone())),
range: TextRange::default(),
});
let parameters = parameters
.cloned()
.unwrap_or_else(|| Parameters::empty(TextRange::default()));
let parameters = parameters.cloned().unwrap_or_default();
if let Some(annotation) = annotation {
if let Some((arg_types, return_type)) = extract_types(annotation, semantic) {
// A `lambda` expression can only have positional and positional-only
// arguments. The order is always positional-only first, then positional.
// A `lambda` expression can only have positional-only and positional-or-keyword
// arguments. The order is always positional-only first, then positional-or-keyword.
let new_posonlyargs = parameters
.posonlyargs
.iter()

View File

@@ -1755,23 +1755,19 @@ fn missing_args(checker: &mut Checker, docstring: &Docstring, docstrings_args: &
// Look for arguments that weren't included in the docstring.
let mut missing_arg_names: FxHashSet<String> = FxHashSet::default();
// If this is a non-static method, skip `cls` or `self`.
for ParameterWithDefault {
parameter,
default: _,
range: _,
} in function
.parameters
.posonlyargs
.iter()
.chain(&function.parameters.args)
.chain(&function.parameters.kwonlyargs)
.skip(
// If this is a non-static method, skip `cls` or `self`.
usize::from(
docstring.definition.is_method()
&& !is_staticmethod(&function.decorator_list, checker.semantic()),
),
)
.iter_non_variadic_params()
.skip(usize::from(
docstring.definition.is_method()
&& !is_staticmethod(&function.decorator_list, checker.semantic()),
))
{
let arg_name = parameter.name.as_str();
if !arg_name.starts_with('_') && !docstrings_args.contains(arg_name) {

View File

@@ -10,8 +10,9 @@ use crate::checkers::ast::Checker;
///
/// ## Why is this bad?
/// In Python 2, the `print` statement can be used with the `>>` syntax to
/// print to a file-like object. This `print >> sys.stderr` syntax is
/// deprecated in Python 3.
/// print to a file-like object. This `print >> sys.stderr` syntax no
/// longer exists in Python 3, where `print` is only a function, not a
/// statement.
///
/// Instead, use the `file` keyword argument to the `print` function, the
/// `sys.stderr.write` function, or the `logging` module.

View File

@@ -134,7 +134,7 @@ F821_17.py:77:15: F821 Undefined name `DoesNotExist1`
77 | class Foo[T: (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
| ^^^^^^^^^^^^^ F821
78 |
79 | # Type parameters in nested classes
79 | # Same in defaults
|
F821_17.py:77:30: F821 Undefined name `DoesNotExist2`
@@ -144,35 +144,117 @@ F821_17.py:77:30: F821 Undefined name `DoesNotExist2`
77 | class Foo[T: (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
| ^^^^^^^^^^^^^ F821
78 |
79 | # Type parameters in nested classes
79 | # Same in defaults
|
F821_17.py:92:52: F821 Undefined name `t`
F821_17.py:81:14: F821 Undefined name `DoesNotExist`
|
90 | return x
91 |
92 | def cannot_access_parent_variable(self, x: t) -> t: # F821: Undefined name `T`
| ^ F821
93 | t # F821: Undefined name `t`
94 | return x
79 | # Same in defaults
80 |
81 | type Foo[T = DoesNotExist] = T # F821: Undefined name `DoesNotExist`
| ^^^^^^^^^^^^ F821
82 | def foo[T = DoesNotExist](t: T) -> T: return t # F821: Undefined name `DoesNotExist`
83 | class Foo[T = DoesNotExist](list[T]): ... # F821: Undefined name `DoesNotExist`
|
F821_17.py:92:58: F821 Undefined name `t`
F821_17.py:82:13: F821 Undefined name `DoesNotExist`
|
90 | return x
91 |
92 | def cannot_access_parent_variable(self, x: t) -> t: # F821: Undefined name `T`
| ^ F821
93 | t # F821: Undefined name `t`
94 | return x
81 | type Foo[T = DoesNotExist] = T # F821: Undefined name `DoesNotExist`
82 | def foo[T = DoesNotExist](t: T) -> T: return t # F821: Undefined name `DoesNotExist`
| ^^^^^^^^^^^^ F821
83 | class Foo[T = DoesNotExist](list[T]): ... # F821: Undefined name `DoesNotExist`
|
F821_17.py:93:17: F821 Undefined name `t`
F821_17.py:83:15: F821 Undefined name `DoesNotExist`
|
92 | def cannot_access_parent_variable(self, x: t) -> t: # F821: Undefined name `T`
93 | t # F821: Undefined name `t`
| ^ F821
94 | return x
81 | type Foo[T = DoesNotExist] = T # F821: Undefined name `DoesNotExist`
82 | def foo[T = DoesNotExist](t: T) -> T: return t # F821: Undefined name `DoesNotExist`
83 | class Foo[T = DoesNotExist](list[T]): ... # F821: Undefined name `DoesNotExist`
| ^^^^^^^^^^^^ F821
84 |
85 | type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
|
F821_17.py:85:15: F821 Undefined name `DoesNotExist1`
|
83 | class Foo[T = DoesNotExist](list[T]): ... # F821: Undefined name `DoesNotExist`
84 |
85 | type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
| ^^^^^^^^^^^^^ F821
86 | def foo[T = (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
87 | class Foo[T = (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
|
F821_17.py:85:30: F821 Undefined name `DoesNotExist2`
|
83 | class Foo[T = DoesNotExist](list[T]): ... # F821: Undefined name `DoesNotExist`
84 |
85 | type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
| ^^^^^^^^^^^^^ F821
86 | def foo[T = (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
87 | class Foo[T = (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
|
F821_17.py:86:14: F821 Undefined name `DoesNotExist1`
|
85 | type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
86 | def foo[T = (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
| ^^^^^^^^^^^^^ F821
87 | class Foo[T = (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
|
F821_17.py:86:29: F821 Undefined name `DoesNotExist2`
|
85 | type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
86 | def foo[T = (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
| ^^^^^^^^^^^^^ F821
87 | class Foo[T = (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
|
F821_17.py:87:16: F821 Undefined name `DoesNotExist1`
|
85 | type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
86 | def foo[T = (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
87 | class Foo[T = (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
| ^^^^^^^^^^^^^ F821
88 |
89 | # Type parameters in nested classes
|
F821_17.py:87:31: F821 Undefined name `DoesNotExist2`
|
85 | type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
86 | def foo[T = (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
87 | class Foo[T = (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
| ^^^^^^^^^^^^^ F821
88 |
89 | # Type parameters in nested classes
|
F821_17.py:102:52: F821 Undefined name `t`
|
100 | return x
101 |
102 | def cannot_access_parent_variable(self, x: t) -> t: # F821: Undefined name `T`
| ^ F821
103 | t # F821: Undefined name `t`
104 | return x
|
F821_17.py:102:58: F821 Undefined name `t`
|
100 | return x
101 |
102 | def cannot_access_parent_variable(self, x: t) -> t: # F821: Undefined name `T`
| ^ F821
103 | t # F821: Undefined name `t`
104 | return x
|
F821_17.py:103:17: F821 Undefined name `t`
|
102 | def cannot_access_parent_variable(self, x: t) -> t: # F821: Undefined name `T`
103 | t # F821: Undefined name `t`
| ^ F821
104 | return x
|

View File

@@ -16,6 +16,7 @@ mod tests {
#[test_case(Rule::BlanketTypeIgnore, Path::new("PGH003_0.py"))]
#[test_case(Rule::BlanketTypeIgnore, Path::new("PGH003_1.py"))]
#[test_case(Rule::BlanketNOQA, Path::new("PGH004_0.py"))]
#[test_case(Rule::BlanketNOQA, Path::new("PGH004_1.py"))]
#[test_case(Rule::InvalidMockAccess, Path::new("PGH005_0.py"))]
fn rules(rule_code: Rule, path: &Path) -> Result<()> {
let snapshot = format!("{}_{}", rule_code.noqa_code(), path.to_string_lossy());

View File

@@ -88,9 +88,8 @@ pub(crate) fn blanket_noqa(
) {
for directive_line in noqa_directives.lines() {
if let Directive::All(all) = &directive_line.directive {
let line = locator.slice(directive_line.range);
let offset = directive_line.range.start();
let noqa_end = all.end() - offset;
let line = locator.slice(directive_line);
let noqa_end = all.end() - directive_line.start();
// Skip the `# noqa`, plus any trailing whitespace.
let mut cursor = Cursor::new(&line[noqa_end.to_usize()..]);

View File

@@ -0,0 +1,8 @@
---
source: crates/ruff_linter/src/rules/pygrep_hooks/mod.rs
---
PGH004_1.py:1:1: PGH004 Use specific rule codes when using `noqa`
|
1 | #noqa
| ^^^^^ PGH004
|

View File

@@ -51,7 +51,7 @@ pub(crate) fn import_self(alias: &Alias, module_path: Option<&[String]>) -> Opti
/// PLW0406
pub(crate) fn import_from_self(
level: Option<u32>,
level: u32,
module: Option<&str>,
names: &[Alias],
module_path: Option<&[String]>,

View File

@@ -78,7 +78,7 @@ pub(crate) fn manual_from_import(
asname: None,
range: TextRange::default(),
}],
level: Some(0),
level: 0,
range: TextRange::default(),
};
diagnostic.set_fix(Fix::safe_edit(Edit::range_replacement(

View File

@@ -1,8 +1,8 @@
use ruff_diagnostics::{Diagnostic, Violation};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast as ast;
use ruff_python_ast::identifier::Identifier;
use ruff_python_ast::name::QualifiedName;
use ruff_python_ast::{self as ast, ParameterWithDefault};
use ruff_python_semantic::{
analyze::{function_type, visibility},
Scope, ScopeId, ScopeKind,
@@ -102,9 +102,8 @@ pub(crate) fn no_self_use(
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.next()
.map(ParameterWithDefault::as_parameter)
.map(|param| &param.parameter)
else {
return;
};

View File

@@ -51,20 +51,13 @@ pub(crate) fn property_with_parameters(
decorator_list: &[Decorator],
parameters: &Parameters,
) {
let semantic = checker.semantic();
if !decorator_list
.iter()
.any(|decorator| semantic.match_builtin_expr(&decorator.expression, "property"))
{
if parameters.len() <= 1 {
return;
}
if parameters
.posonlyargs
let semantic = checker.semantic();
if decorator_list
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
.count()
> 1
.any(|decorator| semantic.match_builtin_expr(&decorator.expression, "property"))
{
checker
.diagnostics

View File

@@ -61,10 +61,7 @@ impl Violation for TooManyArguments {
pub(crate) fn too_many_arguments(checker: &mut Checker, function_def: &ast::StmtFunctionDef) {
let num_arguments = function_def
.parameters
.args
.iter()
.chain(&function_def.parameters.kwonlyargs)
.chain(&function_def.parameters.posonlyargs)
.iter_non_variadic_params()
.filter(|arg| {
!checker
.settings

View File

@@ -62,14 +62,14 @@ pub(crate) fn too_many_positional(checker: &mut Checker, function_def: &ast::Stm
// Count the number of positional arguments.
let num_positional_args = function_def
.parameters
.args
.posonlyargs
.iter()
.chain(&function_def.parameters.posonlyargs)
.filter(|arg| {
.chain(&function_def.parameters.args)
.filter(|param| {
!checker
.settings
.dummy_variable_rgx
.is_match(&arg.parameter.name)
.is_match(&param.parameter.name)
})
.count();

View File

@@ -26,4 +26,19 @@ property_with_parameters.py:15:9: PLR0206 Cannot have defined parameters for pro
16 | return param + param1
|
property_with_parameters.py:35:9: PLR0206 Cannot have defined parameters for properties
|
33 | class VariadicParameters:
34 | @property
35 | def attribute_var_args(self, *args): # [property-with-parameters]
| ^^^^^^^^^^^^^^^^^^ PLR0206
36 | return sum(args)
|
property_with_parameters.py:39:9: PLR0206 Cannot have defined parameters for properties
|
38 | @property
39 | def attribute_var_kwargs(self, **kwargs): #[property-with-parameters]
| ^^^^^^^^^^^^^^^^^^^^ PLR0206
40 | return {key: value * 2 for key, value in kwargs.items()}
|

View File

@@ -68,7 +68,7 @@ pub(crate) fn deprecated_c_element_tree(checker: &mut Checker, stmt: &Stmt) {
level,
range: _,
}) => {
if level.is_some_and(|level| level > 0) {
if *level > 0 {
// Ex) `import .xml.etree.cElementTree as ET`
} else if let Some(module) = module {
if module == "xml.etree.cElementTree" {

View File

@@ -643,10 +643,10 @@ pub(crate) fn deprecated_import(
stmt: &Stmt,
names: &[Alias],
module: Option<&str>,
level: Option<u32>,
level: u32,
) {
// Avoid relative and star imports.
if level.is_some_and(|level| level > 0) {
if level > 0 {
return;
}
if names.first().is_some_and(|name| &name.name == "*") {

View File

@@ -319,7 +319,7 @@ pub(crate) fn deprecated_mock_import(checker: &mut Checker, stmt: &Stmt) {
level,
..
}) => {
if level.is_some_and(|level| level > 0) {
if *level > 0 {
return;
}

View File

@@ -132,6 +132,9 @@ pub(crate) fn non_pep695_type_alias(checker: &mut Checker, stmt: &StmtAnnAssign)
}
None => None,
},
// We don't handle defaults here yet. Should perhaps be a different rule since
// defaults are only valid in 3.13+.
default: None,
})
})
.collect(),

View File

@@ -42,6 +42,7 @@ mod tests {
#[test_case(Rule::HashlibDigestHex, Path::new("FURB181.py"))]
#[test_case(Rule::ListReverseCopy, Path::new("FURB187.py"))]
#[test_case(Rule::WriteWholeFile, Path::new("FURB103.py"))]
#[test_case(Rule::FStringNumberFormat, Path::new("FURB116.py"))]
#[test_case(Rule::SortedMinMax, Path::new("FURB192.py"))]
fn rules(rule_code: Rule, path: &Path) -> Result<()> {
let snapshot = format!("{}_{}", rule_code.noqa_code(), path.to_string_lossy());

View File

@@ -0,0 +1,181 @@
use ruff_diagnostics::{Diagnostic, Edit, Fix, FixAvailability, Violation};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{self as ast, Expr, ExprCall};
use ruff_text_size::Ranged;
use crate::checkers::ast::Checker;
use crate::fix::snippet::SourceCodeSnippet;
/// ## What it does
/// Checks for uses of `bin(...)[2:]` (or `hex`, or `oct`) to convert
/// an integer into a string.
///
/// ## Why is this bad?
/// When converting an integer to a baseless binary, hexadecimal, or octal
/// string, using f-strings is more concise and readable than using the
/// `bin`, `hex`, or `oct` functions followed by a slice.
///
/// ## Example
/// ```python
/// print(bin(1337)[2:])
/// ```
///
/// Use instead:
/// ```python
/// print(f"{1337:b}")
/// ```
#[violation]
pub struct FStringNumberFormat {
replacement: Option<SourceCodeSnippet>,
base: Base,
}
impl Violation for FStringNumberFormat {
const FIX_AVAILABILITY: FixAvailability = FixAvailability::Sometimes;
#[derive_message_formats]
fn message(&self) -> String {
let FStringNumberFormat { replacement, base } = self;
let function_name = base.function_name();
if let Some(display) = replacement
.as_ref()
.and_then(SourceCodeSnippet::full_display)
{
format!("Replace `{function_name}` call with `{display}`")
} else {
format!("Replace `{function_name}` call with f-string")
}
}
fn fix_title(&self) -> Option<String> {
let FStringNumberFormat { replacement, .. } = self;
if let Some(display) = replacement
.as_ref()
.and_then(SourceCodeSnippet::full_display)
{
Some(format!("Replace with `{display}`"))
} else {
Some(format!("Replace with f-string"))
}
}
}
/// FURB116
pub(crate) fn fstring_number_format(checker: &mut Checker, subscript: &ast::ExprSubscript) {
// The slice must be exactly `[2:]`.
let Expr::Slice(ast::ExprSlice {
lower: Some(lower),
upper: None,
step: None,
..
}) = subscript.slice.as_ref()
else {
return;
};
let Expr::NumberLiteral(ast::ExprNumberLiteral {
value: ast::Number::Int(int),
..
}) = lower.as_ref()
else {
return;
};
if *int != 2 {
return;
}
// The call must be exactly `hex(...)`, `bin(...)`, or `oct(...)`.
let Expr::Call(ExprCall {
func, arguments, ..
}) = subscript.value.as_ref()
else {
return;
};
if !arguments.keywords.is_empty() {
return;
}
let [arg] = &*arguments.args else {
return;
};
let Some(id) = checker.semantic().resolve_builtin_symbol(func) else {
return;
};
let Some(base) = Base::from_str(id) else {
return;
};
// Generate a replacement, if possible.
let replacement = if matches!(
arg,
Expr::NumberLiteral(_) | Expr::Name(_) | Expr::Attribute(_)
) {
let inner_source = checker.locator().slice(arg);
let quote = checker.stylist().quote();
let shorthand = base.shorthand();
Some(format!("f{quote}{{{inner_source}:{shorthand}}}{quote}"))
} else {
None
};
let mut diagnostic = Diagnostic::new(
FStringNumberFormat {
replacement: replacement.as_deref().map(SourceCodeSnippet::from_str),
base,
},
subscript.range(),
);
if let Some(replacement) = replacement {
diagnostic.set_fix(Fix::safe_edit(Edit::range_replacement(
replacement,
subscript.range(),
)));
}
checker.diagnostics.push(diagnostic);
}
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum Base {
Hex,
Bin,
Oct,
}
impl Base {
/// Returns the shorthand for the base.
fn shorthand(self) -> &'static str {
match self {
Base::Hex => "x",
Base::Bin => "b",
Base::Oct => "o",
}
}
/// Returns the builtin function name for the base.
fn function_name(self) -> &'static str {
match self {
Base::Hex => "hex",
Base::Bin => "bin",
Base::Oct => "oct",
}
}
/// Parses the base from a string.
fn from_str(s: &str) -> Option<Self> {
match s {
"hex" => Some(Base::Hex),
"bin" => Some(Base::Bin),
"oct" => Some(Base::Oct),
_ => None,
}
}
}

Some files were not shown because too many files have changed in this diff Show More