Compare commits

..

164 Commits

Author SHA1 Message Date
Dhruv Manilawala
ea48d1068d Use usize, simplify is_eof 2024-05-01 09:46:31 +05:30
Dhruv Manilawala
afe4cdb691 Update Cursor to use position instead of iterator 2024-05-01 09:24:43 +05:30
plredmond
c391c8b6cb Red Knot - Add symbol flags (#11134)
* Adds `Symbol.flag` bitfield. Populates it from (the three renamed)
`add_or_update_symbol*` methods.
* Currently there are these flags supported:
  * `IS_DEFINED` is set in a scope where a variable is defined.
* `IS_USED` is set in a scope where a variable is referenced. (To have
both this and `IS_DEFINED` would require two separate appearances of a
variable in the same scope-- one def and one use.)
* `MARKED_GLOBAL` and `MARKED_NONLOCAL` are **not yet implemented**.
(*TODO: While traversing, if you find these declarations, add these
flags to the variable.*)
* Adds `Symbol.kind` field (commented) and the data structure which will
populate it: `Kind` which is an enum of freevar, cellvar,
implicit_global, and implicit_local. **Not yet populated**. (*TODO: a
second pass over the scope (or the ast?) will observe the
`MARKED_GLOBAL` and `MARKED_NONLOCAL` flags to populate this field. When
that's added, we'll uncomment the field.*)
* Adds a few tests that the `IS_DEFINED` and `IS_USED` fields are
correctly set and/or merged:
* Unit test that subsequent calls to `add_or_update_symbol` will merge
the flag arguments.
* Unit test that in the statement `x = foo`, the variable `foo` is
considered used but not defined.
* Unit test that in the statement `from bar import foo`, the variable
`foo` is considered defined but not used.

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2024-04-29 17:07:23 -07:00
Carl Meyer
ce030a467f [red-knot] resolve base class types (#11178)
## Summary

Resolve base class types, as long as they are simple names.

## Test Plan

cargo test
2024-04-29 16:22:30 -06:00
Dhruv Manilawala
04a922866a Add basic docs for the parser crate (#11199)
## Summary

This PR adds a basic README for the `ruff_python_parser` crate and
updates the CONTRIBUTING docs with the fuzzer and benchmark section.

Additionally, it also updates some inline documentation within the
parser crate and splits the `parse_program` function into
`parse_single_expression` and `parse_module` which will be called by
matching against the `Mode`.

This PR doesn't go into too much internal detail around the parser logic
due to the following reasons:
1. Where should the docs go? Should it be as a module docs in `lib.rs`
or in README?
2. The parser is still evolving and could include a lot of refactors
with the future work (feedback loop and improved error recovery and
resilience)

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-04-29 17:08:07 +00:00
Alex Waygood
0ed7af35ec Add a daily workflow to fuzz the parser with randomly selected seeds (#11203) 2024-04-29 17:54:17 +01:00
Alex Waygood
87929ad5f1 Add convenience methods for iterating over all parameter nodes in a function (#11174) 2024-04-29 10:36:15 +00:00
renovate[bot]
8a887daeb4 Update pre-commit dependencies (#11195)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2024-04-29 08:40:21 +00:00
renovate[bot]
7317d734be Update dependency monaco-editor to ^0.48.0 (#11197)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:34:49 +02:00
renovate[bot]
c1a2a60182 Update NPM Development dependencies (#11196)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:33:12 +02:00
renovate[bot]
8e056b3a93 Update Rust crate serde to v1.0.199 (#11192)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:17:15 +02:00
renovate[bot]
616dd1873f Update Rust crate matchit to v0.8.2 (#11189)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:16:32 +02:00
renovate[bot]
acfb1a83c9 Update Rust crate serde_with to v3.8.1 (#11193)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:16:14 +02:00
renovate[bot]
7c0e32f255 Update Rust crate schemars to v0.8.17 (#11191)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:15:38 +02:00
renovate[bot]
4b84c55e3a Update Rust crate parking_lot to v0.12.2 (#11190)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:14:57 +02:00
renovate[bot]
4c8d33ec45 Update Rust crate hashbrown to v0.14.5 (#11188)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-29 08:14:33 +02:00
Alex Waygood
113e259e6d Various small improvements to the fuzz-parser script (#11186) 2024-04-28 18:17:27 +00:00
Micha Reiser
3474e37836 [red-knot] Unresolved imports lint rule (#11164) 2024-04-28 12:12:49 +02:00
Charlie Marsh
dfe90a3b2b Add a --release build to CI (#11182)
## Summary

We merged a failure here (#11177), and it only takes ~five minutes
anyway (which is shorter than some of our other jobs).
2024-04-27 20:36:33 -04:00
Micha Reiser
00d7c01cfc [red-knot] Fix absolute imports in module.resolve_name (#11180) 2024-04-27 20:07:07 +02:00
Micha Reiser
983a06cec3 [red-knot] Resolve and check dependencies (#11161) 2024-04-27 15:49:03 +00:00
Alex Waygood
47692027bf Fix cargo build --release (#11177)
## Summary

`cargo build --release` currently fails to compile on `main`:

<details>

```
error[E0599]: no method named `hit` found for struct `ReleaseStatistics` in the current scope
   --> crates/red_knot/src/cache.rs:22:29
    |
22  |             self.statistics.hit();
    |                             ^^^ method not found in `ReleaseStatistics`
...
145 | pub struct ReleaseStatistics;
    | ---------------------------- method `hit` not found for this struct

error[E0599]: no method named `miss` found for struct `ReleaseStatistics` in the current scope
   --> crates/red_knot/src/cache.rs:25:29
    |
25  |             self.statistics.miss();
    |                             ^^^^ method not found in `ReleaseStatistics`
...
145 | pub struct ReleaseStatistics;
    | ---------------------------- method `miss` not found for this struct

error[E0599]: no method named `hit` found for struct `ReleaseStatistics` in the current scope
   --> crates/red_knot/src/cache.rs:36:33
    |
36  |                 self.statistics.hit();
    |                                 ^^^ method not found in `ReleaseStatistics`
...
145 | pub struct ReleaseStatistics;
    | ---------------------------- method `hit` not found for this struct

error[E0599]: no method named `miss` found for struct `ReleaseStatistics` in the current scope
   --> crates/red_knot/src/cache.rs:41:33
    |
41  |                 self.statistics.miss();
    |                                 ^^^^ method not found in `ReleaseStatistics`
...
145 | pub struct ReleaseStatistics;
    | ---------------------------- method `miss` not found for this struct
```

</details>

This is because in a release build, `CacheStatistics` is a type alias
for `ReleaseStatistics`, and `ReleaseStatistics` doesn't have `hit()` or
`miss()` methods. (In a debug build, `CacheStatistics` is a type alias
for `DebugStatistics`, which _does_ have those methods.)

Possibly we could make this less likely to happen in the future by
making both structs implement a common trait instead of using type
aliases that vary depending on whether it's a debug build or not? For
now, though, this PR just brings the two structs in sync w.r.t. the
methods they expose.

## Test Plan

`cargo build --release` now once again compiles for me locally
2024-04-27 11:45:32 -04:00
Charlie Marsh
ec3243a6e5 Prioritize redefined-while-unused over unused-import (#11173)
## Summary

This PR adds an override to the fixer to ensure that we apply any
`redefined-while-unused` fixes prior to `unused-import`.

Closes https://github.com/astral-sh/ruff/issues/10905.
2024-04-27 11:44:53 -04:00
Carl Meyer
2d6978f236 [red-knot] fix class vs instance (#11175)
## Summary

Clarify the type of an exact class object vs the type of instances of
that class.

## Test Plan

cargo test
2024-04-27 09:09:02 -06:00
Auguste Lalande
2490d2d4af [ruff] Detect duplicate codes as part of unused-noqa (RUF100) (#10850)
## Summary

Implement duplicate code detection as part of `RUF100`, mirroring the
behavior of `flake8-noqa` (`NQA005`) mentioned in #850. The idea to
merge the rule into `RUF100` was suggested by @MichaReiser
https://github.com/astral-sh/ruff/pull/10325#issuecomment-2025535444.

## Test Plan

Test cases were added to the fixture.
2024-04-27 12:33:44 +00:00
Jelle Zijlstra
59b73fabc1 [pyflakes] Improve invalid-print-syntax documentation (#11171)
This syntax wasn't "deprecated" in Python 3; it was removed.

I started looking at this rule because I was curious how Ruff could even
detect this without a Python 2 parser. Then I realized that
"print >> f, x" is actually valid Python 3 syntax: it creates a tuple
containing a right-shifted version of the print function.
2024-04-27 07:54:04 -04:00
Micha Reiser
61c97a037c red-knot: Introduce program.check (#11148) 2024-04-27 09:01:20 +00:00
Micha Reiser
7cd065e4a2 Kick off Red-knot (#10849)
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Carl Meyer <carl@astral.sh>
2024-04-27 08:34:00 +00:00
Carl Meyer
845ba7cf5f Make ImportFrom level just a u32 (#11170) 2024-04-26 20:38:35 -06:00
Auguste Lalande
5994414739 [ruff] Implement redirected-noqa (RUF101) (#11052)
## Summary

Based on discussion in #10850.

As it stands today `RUF100` will attempt to replace code redirects with
their target codes even though this is not the "goal" of `RUF100`. This
behavior is confusing and inconsistent, since code redirects which don't
otherwise violate `RUF100` will not be updated. The behavior is also
undocumented. Additionally, users who want to use `RUF100` but do not
want to update redirects have no way to opt out.

This PR explicitly detects redirects with a new rule `RUF101` and
patches `RUF100` to keep original codes in fixes and reporting.

## Test Plan

Added fixture.
2024-04-27 02:08:11 +00:00
Jane Lewis
632965d0fa ruff server: Support a custom TOML configuration file (#11140)
## Summary

Closes #10985.

The server now supports a custom TOML configuration file as a client
setting. The setting must be an absolute path to a file. If the file is
called `pyproject.toml`, the server will attempt to parse it as a
pyproject file - otherwise, it will attempt to parse it as a `ruff.toml`
file, even if the file has a name besides `ruff.toml`.

If an option is set in both the custom TOML configuration file and in
the client settings directly, the latter will be used.

## Test Plan

1. Create a `ruff.toml` file outside of the workspace you are testing.
Set an option that is different from the one in the configuration for
your test workspace.
2. Set the path to the configuration in NeoVim:
```lua
require('lspconfig').ruff.setup {
    init_options = {
      settings = {
        configuration = "absolute/path/to/your/configuration"
      }
    }
}
```
3. Confirm that the option in the configuration file is used, regardless
of what the option is set to in the workspace configuration.
4. Add the same option, with a different value, to the NeoVim
configuration directly. For example:
```lua
require('lspconfig').ruff.setup {
    init_options = {
      settings = {
        configuration = "absolute/path/to/your/configuration",
        lint = {
          select = []
        }
      }
    }
}
```
5. Confirm that the option set in client settings is used, regardless of
the value in either the custom configuration file or in the workspace
configuration.
2024-04-26 23:46:07 +00:00
Dhruv Manilawala
77a72ecd38 Avoid multiline expression if format specifier is present (#11123)
## Summary

This PR fixes the bug where the formatter would format an f-string and
could potentially change the AST.

For a triple-quoted f-string, the element can't be formatted into
multiline if it has a format specifier because otherwise the newline
would be treated as part of the format specifier.

Given the following f-string:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f} ddddddddddddddd eeeeeeee"""
```

The formatter sees that the f-string is already multiline so it assumes
that it can contain line breaks i.e., broken into multiple lines. But,
in this specific case we can't format it as:

```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f
} ddddddddddddddd eeeeeeee"""
```
                     
Because the format specifier string would become ".3f\n", which is not
the original string (`.3f`).

If the original source code already contained a newline, they'll be
preserved. For example:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f
} ddddddddddddddd eeeeeeee"""
```

The above will be formatted as:
```py
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {variable:.3f
} ddddddddddddddd eeeeeeee"""
```

Note that the newline after `.3f` is part of the format specifier which
needs to be preserved.
The Python version is irrelevant in this case.

fixes: #10040 

## Test Plan

Add some test cases to verify this behavior.
2024-04-26 13:34:38 +00:00
Jane Lewis
16a1f3cbcc ruff server: Support setting to prioritize project configuration over editor configuration (#11086)
## Summary

This is intended to address
https://github.com/astral-sh/ruff-vscode/issues/425, and is a follow-up
to https://github.com/astral-sh/ruff/pull/11062.

A new client setting is now supported by the server,
`prioritizeFileConfiguration`. This is a boolean setting (default:
`false`) that, if set to `true`, will instruct the configuration
resolver to prioritize file configuration (aka discovered TOML files)
over configuration passed in by the editor.

A corresponding extension PR has been opened, which makes this setting
available for VS Code:
https://github.com/astral-sh/ruff-vscode/pull/457.

## Test Plan

To test this with VS Code, you'll need to check out [the VS Code
PR](https://github.com/astral-sh/ruff-vscode/pull/457) that adds this
setting.

The test process is similar to
https://github.com/astral-sh/ruff/pull/11062, but in scenarios where the
editor configuration would take priority over file configuration, file
configuration should take priority.
2024-04-26 08:17:28 +00:00
Jelle Zijlstra
cd3e319538 Add support for PEP 696 syntax (#11120) 2024-04-26 09:47:29 +02:00
Shi Sheng
45725d3275 Update README.md (#11156)
## Summary

Sorry about an oversight, fixed the LICENSE redirection near the bottom
part of the README file. Now also is the full path
2024-04-25 22:46:47 -04:00
Shi Sheng
bbca8eb388 Update README.md (#11155)
## Summary

Modified the license badge so instead of `LICENSE`, it is now the full
path, and hoepfully will resolve the issue appeared in PyPI with the
license badge redirection
2024-04-25 22:14:21 -04:00
Steve C
c8c227dd5d [refurb] Implement fstring-number-format (FURB116) (#10921)
## Summary

Adds `FURB116`

See #1348 

## Test Plan

`cargo test`
2024-04-26 01:15:33 +00:00
Charlie Marsh
b15e9e6e05 Include inline instantiations when detecting loggers (#11154)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11031.
2024-04-25 21:00:12 -04:00
Ibraheem Ahmed
22d4f11348 Upgrade codspeed-criterion-compat to 2.6.0 (#11153)
## Summary

`codspeed-criterion-compat` 2.6.0 [updates it's package selection
mechanism](https://github.com/CodSpeedHQ/codspeed-rust/pull/43), so
https://github.com/astral-sh/ruff/pull/10735 is no longer needed.

## Test Plan

CI should still be passing.
2024-04-25 20:22:37 -04:00
Alex Waygood
269014a539 Delete unused methods from Parameters (#11150) 2024-04-25 22:11:24 +01:00
Charlie Marsh
dc09f529bc Build a separate ARM wheel for macOS (#11149)
## Summary

Since we already build an x86 wheel, we can just build an ARM wheel
rather than cross-compiling to universal.

The build time is ~3 minutes vs. > 20 minutes and the resulting artifact
is much smaller, which is also a win for users.
2024-04-25 15:06:47 -04:00
Auguste Lalande
3364ef957d [pygrep_hooks] Fix blanket-noqa panic when last line has noqa with no newline (PGH004) (#11108)
## Summary

Resolves #11102

The error stems from these lines

f5c7a62aa6/crates/ruff_linter/src/noqa.rs (L697-L702)
I don't really understand the purpose of incrementing the last index,
but it makes the resulting range invalid for indexing into `contents`.

For now I just detect if the index is too high in `blanket_noqa` and
adjust it if necessary.

## Test Plan

Created fixture from issue example.
2024-04-25 14:39:38 -04:00
Jane Lewis
77c93fd63c Bump version to 0.4.2 (#11151) 2024-04-25 17:31:38 +00:00
Dhruv Manilawala
1c9f5e3001 Display the AST even with syntax errors (#11147)
## Summary

This PR updates the playground to display the AST even if it contains a
syntax error. This could be useful for development and also to give a
quick preview of what error recovery looks like.

Note that not all recovery is correct but this allows us to iterate
quickly on what can be improved.

## Test Plan

Build the playground locally and test it.

<img width="1688" alt="Screenshot 2024-04-25 at 21 02 22"
src="https://github.com/astral-sh/ruff/assets/67177269/2b94934c-4f2c-4a9a-9693-3d8460ed9d0b">
2024-04-25 21:55:23 +05:30
Charlie Marsh
263a0d25ed Use macos-12 to build release wheels (#11146)
## Summary

GitHub has started to change `macos-latest` to `macos-14`. But
executables built on `macos-14` don't work on macOS 11 (see:
https://github.com/astral-sh/uv/issues/3261). This PR explicitly uses
`macos-12` instead (which is what we _intended_ to be using anyway).
2024-04-25 12:07:25 -04:00
Dhruv Manilawala
4738e19974 Remove unused lexical error types (#11145) 2024-04-25 15:24:16 +00:00
bersbersbers
f428bd5052 Docs: mention lint.typing-modules in TCH001, TCH002, TCH003 (#11144)
## Summary

Mention `lint.typing-modules` in `TCH001`, `TCH002`, `TCH003`; close
#11142.
2024-04-25 11:23:03 -04:00
Jane Lewis
4690890e9f ruff server: In 'publish diagnostics' mode, document diagnostics are cleared properly when a file is closed (#11137)
## Summary

Fixes #11114. 

As part of the `onClose` handler, we publish an empty array of
diagnostics for the document being closed, similar to
[`ruff-lsp`](187d7790be/ruff_lsp/server.py (L459-L464)).
This prevent phantom diagnostics from lingering after a document is
closed. We'll only do this if the client doesn't support pull
diagnostics, because otherwise clearing diagnostics is their
responsibility.

## Test Plan

Diagnostics should no longer appear for a document in the Problems tab
after the document is closed.
2024-04-24 19:38:54 -07:00
Maxime Beauchemin
19baabba58 README: add Apache Superset to project list (#11136)
## Summary

Stoked to have found` ruff` and migrated over - quick testimonial ->

> Our smooth and fast migration to ruff allowed us to deprecate 4+ other
disjointed tools (pycln, pyupgrade, black, isort, flake8) that each had
their own CLI subcommands, different enabling/disabling semantics, and
pre-commit hooks. I can't wait to add pylint to the list. It's such a no
brainer to replace all this with the Ferrari of linters.`--add-no-qa`
and `--ignore-noqa` are so convenient, but `--fix` takes it home!
2024-04-24 21:39:39 -04:00
Sid
cee38f39df [flake8-blind-expect] Allow raise from in BLE001 (#11131)
## Summary

This allows `raise from` in BLE001.

```python
try:
    ...
except Exception as e:
    raise ValueError from e
```

Fixes #10806

## Test Plan

Test case added.
2024-04-24 11:56:11 -04:00
Alex Waygood
e3fde28146 [flake8-pyi] Allow overloaded __exit__ and __aexit__ definitions (PYI036) (#11057) 2024-04-24 15:48:52 +00:00
Ibraheem Ahmed
1c8849f9a8 Use Matchit to Resolve Per-File Settings (#11111)
## Summary

Continuation of https://github.com/astral-sh/ruff/pull/9444.

> When the formatter is fully cached, it turns out we actually spend
meaningful time mapping from file to `Settings` (since we use a
hierarchical approach to settings). Using `matchit` rather than
`BTreeMap` improves fully-cached performance by anywhere from 2-5%
depending on the project, and since these are all implementation details
of `Resolver`, it's minimally invasive.

`matchit` supports escaping routing characters so this change should now
be fully compatible.

## Test Plan

On my machine I'm seeing a ~3% improvement with this change.

```
hyperfine --warmup 20 -i "./target/release/main format ../airflow" "./target/release/ruff format ../airflow"
Benchmark 1: ./target/release/main format ../airflow
  Time (mean ± σ):      58.1 ms ±   1.4 ms    [User: 63.1 ms, System: 66.5 ms]
  Range (min … max):    56.1 ms …  62.9 ms    49 runs
 
Benchmark 2: ./target/release/ruff format ../airflow
  Time (mean ± σ):      56.6 ms ±   1.5 ms    [User: 57.8 ms, System: 67.7 ms]
  Range (min … max):    54.1 ms …  63.0 ms    51 runs
 
Summary
  ./target/release/ruff format ../airflow ran
    1.03 ± 0.04 times faster than ./target/release/main format ../airflow
```
2024-04-24 10:25:46 -04:00
Alex Waygood
37af6e6147 [flake8-pyi] Allow simple assignments to None in enum class scopes (PYI026) (#11128) 2024-04-24 15:13:55 +01:00
Micha Reiser
92814fd99b Use crossbeam-channel instead of crossbeam (#11129) 2024-04-24 13:56:55 +00:00
Shi Sheng
c9c2e7b978 Fix link to license in README (#11124) 2024-04-24 15:51:07 +02:00
Charlie Marsh
51dec8d95b [flake8-simplify] Avoid raising SIM911 for non-zip attribute calls (#11126)
## Summary

The `matches!` macro here is slightly off and allows _all_ attribute
calls through.

Closes https://github.com/astral-sh/ruff/issues/11125.
2024-04-24 13:05:17 +00:00
Nolan
7c8c1c71a3 Implement hover menu support for ruff-server; Issue #10595 (#11096)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Add support for hover menu to ruff_server, as requested in
[10595](https://github.com/astral-sh/ruff/issues/10595).
Majority of new code is in hover.rs.
I reused the regex from ruff-lsp's implementation. Also reused the
format_rule_text function from ruff/src/commands/rule.rs
Added capability registration in server.rs, and added the handler to
api.rs.

## Test Plan

Tested in NVIM v0.10.0-dev-2582+g2a8cef6bd, configured with lspconfig
using the default options (other than cmd pointing to my test build,
with options "server" and "--preview"). OS: Ubuntu 24.04, kernel
6.8.0-22.

---------

Co-authored-by: Jane Lewis <me@jane.engineering>
2024-04-23 20:16:02 +00:00
Alex Waygood
455d22cdc8 Fix syntax for some if-conditions in ci.yaml (#11109) 2024-04-23 19:46:58 +01:00
Jane Lewis
35ca887e02 ruff server: Ruff configuration from client settings overrides project configuration (#11062)
## Summary

This is a follow-up to https://github.com/astral-sh/ruff/pull/10984 that
implements configuration resolution for editor configuration. By 'editor
configuration', I'm referring to the client settings that correspond to
Ruff configuration/options, like `preview`, `select`, and so on. These
will be combined with 'project configuration' (configuration taken from
project files such as `pyproject.toml`) to generate the final linter and
formatter settings used by `RuffSettings`. Editor configuration takes
priority over project configuration.

In a follow-up pull request, I'll implement a new client setting that
allows project configuration to override editor configuration, as per
[this issue](https://github.com/astral-sh/ruff-vscode/issues/425).

## Review guide

The first commit, e38966d8843becc7234fa7d46009c16af4ba41e9, is just
doing re-arrangement so that we can pass the right things to
`RuffSettings::resolve`. The actual resolution logic is in the second
commit, 0eec9ee75c10e5ec423bd9f5ce1764f4d7a5ad86. It might help to look
at these comments individually since the diff is rather messy.

## Test Plan

For the settings to show up in VS Code, you'll need to checkout this
branch: https://github.com/astral-sh/ruff-vscode/pull/456.

To test that the resolution for a specific setting works as expected,
run through the following scenarios, setting it in project and editor
configuration as needed:

| Set in project configuration? | Set in editor configuration? |
Expected Outcome |

|-------------------------------|--------------------------------------------------|------------------------------------------------------------------------------------------|
| No | No | The editor should behave as if the setting was set to its
default value. |
| Yes | No | The editor should behave as if the setting was set to the
value in project configuration. |
| No | Yes | The editor should behave as if the setting was set to the
value in editor configuration. |
| Yes | Yes (but distinctive from project configuration) | The editor
should behave as if the setting was set to the value in editor
configuration. |

An exception to this is `extendSelect`, which does not have an analog in
TOML configuration. Instead, you should verify that `extendSelect`
amends the `select` setting. If `select` is set in both editor and
project configuration, `extendSelect` will only append to the `select`
value in editor configuration, so make sure to un-set it there if you're
testing `extendSelect` with `select` in project configuration.
2024-04-23 11:19:17 -07:00
Alex Waygood
f5c7a62aa6 Add a new CI job to fuzz the parser (#11089) 2024-04-23 10:24:04 +00:00
Dhruv Manilawala
38d2562f41 Refactor unary expression parsing (#11088)
## Summary

This PR refactors unary expression parsing with the following changes:
* Ability to get `OperatorPrecedence` from a unary operator (`UnaryOp`)
* Implement methods on `TokenKind`
	* Add `as_unary_operator` which returns an `Option<UnaryOp>`
* Add `as_unary_arithmetic_operator` which returns an `Option<UnaryOp>`
(used for pattern parsing)
* Rename `is_unary` to `is_unary_arithmetic_operator` (used in the
linter)

resolves: #10752 

## Test Plan

Verify that the existing test cases pass, no ecosystem changes, run the
Python based fuzzer on 3000 random inputs and run it on dozens of
open-source repositories.
2024-04-23 04:55:02 +00:00
Dhruv Manilawala
7eba967e16 Refactor binary expression parsing (#11073)
## Summary

This PR refactors the binary expression parsing in a way to make it
readable and easy to understand. It draws inspiration from the suggested
edits in the linked messages in #10752.

### Changes

* Ability to get the precedence of an operator
	* From a boolean operator (`BinOp`) to `OperatorPrecedence`
	* From a binary operator (`Operator`) to `OperatorPrecedence`
	* No comparison operator because all of them have the same precedence
* Implement methods on `TokenKind` to convert it to an appropriate
operator enum
	* Add `as_boolean_operator` which returns an `Option<BoolOp>`
	* Add `as_binary_operator` which returns an `Option<Operator>`
* No `as_comparison_operator` because it requires lookahead and I'm not
sure if `token.as_comparison_operator(peek)` is a good way to implement
it
* Introduce `BinaryLikeOperator`
	* Constructed from two tokens using the methods from the second point
* Add `precedence` method using the conversion methods mentioned in the
first point
* Make most of the functions in `TokenKind` private to the module
* Use `self` instead of `&self` for `TokenKind` 

fixes: #11072

## Test Plan

Refer #11088
2024-04-23 04:42:40 +00:00
Dhruv Manilawala
5b81b8368d Make associativity a property of operator precedence (#11065)
## Summary

This PR does a few things but the main change is that is makes
associativity a property of operator precedence.

1. Rename `Precedence` -> `OperatorPrecedence`
2. Rename `parse_expression_with_precedence` ->
`parse_binary_expression_or_higher`
3. Move `current_binding_power` to `OperatorPrecedence::try_from_tokens`
[^1]
4. Add a `OperatorPrecedence::is_right_associative` method
5. Move from `increment_precedence` to using `<=` / `<` to check if the
parsing loop needs to stop [^2]

[^1]: Another alternative would be to have two separate methods to avoid
lookahead as it's required only for once case (`not in`). So,
`try_from_current_token(current).or_else(|| try_from_next_token(current,
peek))`
[^2]: This will allow us to easily make the refactors mentioned in
#10752

## Test Plan

Make sure the precedence parsing algorithm is still correct by running
the test suite, fuzz testing it and running it against a dozen or so
open-source repositories.
2024-04-23 04:28:46 +00:00
Dhruv Manilawala
c30735d4a7 Add ExpressionContext for expression parsing (#11055)
## Summary

This PR adds a new `ExpressionContext` struct which is used in
expression parsing.

This solves the following problem:
1. Allowing starred expression with different precedence
2. Allowing yield expression in certain context
3. Remove ambiguity with `in` keyword when parsing a `for ... in`
statement

For context, (1) was solved by adding `parse_star_expression_list` and
`parse_star_expression_or_higher` in #10623, (2) was solved by by adding
`parse_yield_expression_or_else` in #10809, and (3) was fixed in #11009.
All of the mentioned functions have been removed in favor of the context
flags.

As mentioned in #11009, an ideal solution would be to implement an
expression context which is what this PR implements. This is passed
around as function parameter and the call stack is used to automatically
reset the context.

### Recovery

How should the parser recover if the target expression is invalid when
an expression can consume the `in` keyword?

1. Should the `in` keyword be part of the target expression?
2. Or, should the expression parsing stop as soon as `in` keyword is
encountered, no matter the expression?

For example:
```python
for yield x in y: ...

# Here, should this be parsed as
for (yield x) in (y): ...
# Or
for (yield x in y): ...
# where the `in iter` part is missing
```

Or, for binary expression parsing:
```python
for x or y in z: ...

# Should this be parsed as
for (x or y) in z: ...
# Or
for (x or y in z): ...
# where the `in iter` part is missing
```

This need not be solved now, but is very easy to change. For context
this PR does the following:
* For binary, comparison, and unary expressions, stop at `in`
* For lambda, yield expressions, consume the `in`

## Test Plan

1. Add test cases for the `for ... in` statement and verify the
snapshots
2. Make sure the existing test suite pass
3. Run the fuzzer for around 3000 generated source code
4. Run the updated logic on a dozen or so open source repositories
(codename "parser-checkouts")
2024-04-23 04:19:05 +00:00
Jane Lewis
62478c3070 ruff server: Support publish diagnostics as a fallback when pull diagnostics aren't supported (#11092)
## Summary

Fixes #11059 

Several major editors don't support [pull
diagnostics](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocument_pullDiagnostics),
a method of sending diagnostics to the client that was introduced in
version `0.3.17` of the specification. Until now, `ruff server` has only
used pull diagnostics, which resulted in diagnostics not being available
on Neovim and Helix, which don't support pull diagnostics yet (though
Neovim `10.0` will have support for this).

`ruff server` will now utilize the older method of sending diagnostics,
known as 'publish diagnostics', when pull diagnostics aren't supported
by the client. This involves re-linting a document every time it is
opened or modified, and then sending the diagnostics generated from that
lint to the client via the `textDocument/publishDiagnostics`
notification.

## Test Plan

The easiest way to test that this PR works is to check if diagnostics
show up on Neovim `<=0.9`.
2024-04-22 21:06:35 -07:00
Ottavio Hartman
111bbc61f6 [refurb] New rule to suggest min/max over sorted() (FURB192) (#10868)
## Summary

Fixes #10463

Add `FURB192` which detects violations like this:

```python
# Bad
a = sorted(l)[0]

# Good
a = min(l)
```

There is a caveat that @Skylion007 has pointed out, which is that
violations with `reverse=True` technically aren't compatible with this
change, in the edge case where the unstable behavior is intended. For
example:

```python
from operator import itemgetter
data = [('red', 1), ('blue', 1), ('red', 2), ('blue', 2)]

min(data, key=itemgetter(0))  # ('blue', 1)
sorted(data, key=itemgetter(0))[0]  # ('blue', 1)
sorted(data, key=itemgetter(0), reverse=True)[-1]  # ('blue, 2')
```

This seems like a rare edge case, but I can make the `reverse=True`
fixes unsafe if that's best.

## Test Plan

This is unit tested.

## References

https://github.com/dosisod/refurb/pull/333/files

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-04-23 01:13:57 +00:00
Charlie Marsh
925c7f8dd3 [refurb] Advoid operator.itemgetter suggestion for single-item tuple (#11095)
## Summary

The `operator.itemgetter` behavior changes where there's more than one
argument, such that `operator.itemgetter(0)` yields `r[0]`, rather than
`(r[0],)`.

Closes https://github.com/astral-sh/ruff/issues/11075.
2024-04-23 00:20:43 +00:00
KotlinIsland
5b4c8a7c5f (📚) fix docs for invalid-X-returns (#11094)
## Summary

There is no class `integer` in python, nor is there a type `integer`, so
I updated the docs to remove the backticks on these references, such
that it is the representation of an integer, and not a reference.
2024-04-23 00:17:47 +00:00
Auguste Lalande
647548b5e7 [pygrep_hooks] Move blanket-noqa to noqa checker (PGH004) (#11053)
## Summary

Move `blanket-noqa` rule from the token checker to the noqa checker.
This allows us to make use of the line directives already computed in
the noqa checker.

## Test Plan

Verified test results are unchanged.
2024-04-22 13:36:25 -04:00
plredmond
a9919707d4 [UP031] When encountering "%s" % var offer unsafe fix (#11019)
Resolves #10187

<details>
<summary>Old PR description; accurate through commit e86dd7d; probably
best to leave this fold closed</summary>

## Description of change

In the case of a printf-style format string with only one %-placeholder
and a variable at right (e.g. `"%s" % var`):

* The new behavior attempts to dereference the variable and then match
on the bound expression to distinguish between a 1-tuple (fix), n-tuple
(bug 🐛), or a non-tuple (fix). Dereferencing is via
`analyze::typing::find_binding_value`.
* If the variable cannot be dereferenced, then the type-analysis routine
is called to distinguish only tuple (no-fix) or non-tuple (fix). Type
analysis is via `analyze::typing::is_tuple`.
* If any of the above fails, the rule still fires, but no fix is
offered.

## Alternatives

* If the reviewers think that singling out the 1-tuple case is too
complicated, I will remove that.
* The ecosystem results show that no new fixes are detected. So I could
probably delete all the variable dereferencing code and code that tries
to generate fixes, tbh.

## Changes to existing behavior

**All the previous rule-firings and fixes are unchanged except for** the
"false negatives" in
`crates/ruff_linter/resources/test/fixtures/pyupgrade/UP031_1.py`. Those
previous "false negatives" are now true positives and so I moved them to
`crates/ruff_linter/resources/test/fixtures/pyupgrade/UP031_0.py`.

<details>
<summary>Existing false negatives that are now true positives</summary>

```
crates/ruff_linter/resources/test/fixtures/pyupgrade/UP031_0.py:134:1: UP031 Use format specifiers instead of percent format
    |
133 | # UP031 (no longer false negatives)
134 | 'Hello %s' % bar
    | ^^^^^^^^^^^^^^^^ UP031
135 |
136 | 'Hello %s' % bar.baz
    |
    = help: Replace with format specifiers

crates/ruff_linter/resources/test/fixtures/pyupgrade/UP031_0.py:136:1: UP031 Use format specifiers instead of percent format
    |
134 | 'Hello %s' % bar
135 |
136 | 'Hello %s' % bar.baz
    | ^^^^^^^^^^^^^^^^^^^^ UP031
137 |
138 | 'Hello %s' % bar['bop']
    |
    = help: Replace with format specifiers

crates/ruff_linter/resources/test/fixtures/pyupgrade/UP031_0.py:138:1: UP031 Use format specifiers instead of percent format
    |
136 | 'Hello %s' % bar.baz
137 |
138 | 'Hello %s' % bar['bop']
    | ^^^^^^^^^^^^^^^^^^^^^^^ UP031
    |
    = help: Replace with format specifiers
```
One of them newly offers a fix.
```
 # UP031 (no longer false negatives)
-'Hello %s' % bar
+'Hello {}'.format(bar)
```
This fix occurs because the new code dereferences `bar` to where it was
defined earlier in the file as a non-tuple:
```python
bar = {"bar": y}
```

---

</details>

## Behavior requiring new tests

Additionally, we now handle a few cases that we didn't previously test.
These cases are when a string has a single %-placeholder and the
righthand operand to the modulo operator is a variable **which can be
dereferenced.** One of those was shown in the previous section (the
"dereference non-tuple" case).

<details>
<summary>New cases handled</summary>

```
crates/ruff_linter/resources/test/fixtures/pyupgrade/UP031_0.py:126:1: UP031 [*] Use format specifiers instead of percent format
    |
125 | t1 = (x,)
126 | "%s" % t1
    | ^^^^^^^^^ UP031
127 | # UP031: deref t1 to 1-tuple, offer fix
    |
    = help: Replace with format specifiers

crates/ruff_linter/resources/test/fixtures/pyupgrade/UP031_0.py:130:1: UP031 Use format specifiers instead of percent format
    |
129 | t2 = (x,y)
130 | "%s" % t2
    | ^^^^^^^^^ UP031
131 | # UP031: deref t2 to n-tuple, this is a bug
    |
    = help: Replace with format specifiers
```
One of these offers a fix.
```
 t1 = (x,)
-"%s" % t1
+"{}".format(t1[0])
 # UP031: deref t1 to 1-tuple, offer fix
```
The other doesn't offer a fix because it's a bug.

---

</details>

---

</details>


## Changes to existing behavior

In the case of a string with a single %-placeholder and a single
ambiguous righthand argument to the modulo operator, (e.g. `"%s" % var`)
the rule now fires and offers a fix. We explain about this in the "fix
safety" section of the updated documentation.


## Documentation changes

I swapped the order of the "known problems" and the "examples" sections
so that the examples which describe the rule are first, before the
exceptions to the rule are described. I also tweaked the language to be
more explicit, as I had trouble understanding the documentation at
first. The "known problems" section is now "fix safety" but the content
is largely similar.

The diff of the documentation changes looks a little difficult unless
you look at the individual commits.
2024-04-22 08:40:51 -07:00
Alex Waygood
5dcb1d9e8c Improvements to the fuzz-parser script (#11071)
## Summary

- Properly fix the race condition identified in
https://github.com/astral-sh/ruff/pull/11039. Instead of running the
version of Ruff we're testing by invoking `cargo run --release` on each
generated source file, we either (1) accept a path to an executable on
the command line or (2) if that's not specified, we run `cargo build
--release` once at the start and then invoke the executable found in
`target/release/ruff` directly.
- Now that the race condition is properly fixed, remove the workaround
for the race condition added in
https://github.com/astral-sh/ruff/pull/11039.
- Also allow users to pass in an executable to compare against for the
`--only-new-bugs` argument (previously it was hardcoded to always
compare against the version of Ruff installed into the Python
environment)
- Use `argparse.RawDescriptionHelpFormatter` as the formatter class
rather than `argparse.RawTextHelpFormatter`. This means that long help
texts for the individual arguments will be wrapped to a sensible width.
- On completion of the script, indicate success or failure of the script
overall by raising `SytemExit` with the appropriate exit code.
- Add myself as a codeowner for the script
2024-04-22 07:46:58 +01:00
renovate[bot]
0b92f450ca Update Rust crate codspeed-criterion-compat to v2.5.0 (#11087)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-22 08:33:05 +02:00
renovate[bot]
54d42957b0 Update dependency react-resizable-panels to v2.0.18 (#11081)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-22 08:32:24 +02:00
renovate[bot]
d3ed0ad68c Update Rust crate thiserror to v1.0.59 (#11080)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-22 08:31:51 +02:00
renovate[bot]
01678a990c Update Rust crate syn to v2.0.60 (#11079)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-22 08:31:33 +02:00
renovate[bot]
45692ce89f Update Rust crate quick-junit to 0.4.0 (#11084) 2024-04-22 07:13:09 +01:00
renovate[bot]
4246111f67 Update Rust crate proc-macro2 to v1.0.81 (#11076) 2024-04-22 07:10:37 +01:00
renovate[bot]
9924bd774a Update Rust crate serde to v1.0.198 (#11077) 2024-04-21 22:50:35 -05:00
renovate[bot]
fc7f07bca7 Update Rust crate serde_json to v1.0.116 (#11078) 2024-04-21 22:50:29 -05:00
renovate[bot]
bd6ca3d586 Update pre-commit dependencies (#11082) 2024-04-21 20:52:27 -05:00
renovate[bot]
23f8e1c3c8 Update NPM Development dependencies (#11083) 2024-04-21 20:52:13 -05:00
Jonathan Plasse
a68938897d [ruff] fix async comprehension false positive (RUF029) (#11070)
## Summary

- Fix #11043 

## Test Plan

Added the false positive code in the test fixture.
2024-04-21 08:17:25 -04:00
Charlie Marsh
d544199272 Respect per-file-ignores for RUF100 with no other diagnostics (#11058)
## Summary

The existing test didn't cover the case in which there are _no_ other
diagnostics in the file.

Closes https://github.com/astral-sh/ruff/issues/10906.
2024-04-20 15:33:22 +00:00
Carl Meyer
c80b9a4a90 Reduce size of Stmt from 144 to 120 bytes (#11051)
## Summary

I happened to notice that we box `TypeParams` on `StmtClassDef` but not
on `StmtFunctionDef` and wondered why, since `StmtFunctionDef` is bigger
and sets the size of `Stmt`.

@charliermarsh found that at the time we started boxing type params on
classes, classes were the largest statement type (see #6275), but that's
no longer true.

So boxing type-params also on functions reduces the overall size of
`Stmt`.

## Test Plan

The `<=` size tests are a bit irritating (since their failure doesn't
tell you the actual size), but I manually confirmed that the size is
actually 120 now.
2024-04-19 17:02:17 -06:00
Charlie Marsh
99f7f94538 Improve documentation around custom isort sections (#11050)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11047.
2024-04-19 22:26:55 +00:00
James Frost
7b3c92a979 [flake8-bugbear] Document explicitly disabling strict zip (B905) (#11040)
Occasionally you intentionally have iterables of differing lengths. The
rule permits this by explicitly adding `strict=False`, but this was not
documented.

## Summary

The rule does not currently document how to avoid it when having
differing length iterables is intentional. This PR adds that to the rule
documentation.
2024-04-19 13:50:18 +00:00
Alex Waygood
fdbcb62adc scripts/fuzz-parser: work around race condition from running cargo build concurrently (#11039) 2024-04-19 14:42:28 +01:00
Dhruv Manilawala
0ff25a540c Bump version to 0.4.1 (#11035)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-04-19 17:42:02 +05:30
Alex Waygood
34873ec009 Add a script to fuzz the parser (courtesy of pysource-codegen) (#11015) 2024-04-19 12:40:36 +01:00
Dhruv Manilawala
d3cd61f804 Use empty range when there's "gap" in token source (#11032)
## Summary

This fixes a bug where the parser would panic when there is a "gap" in
the token source.

What's a gap?

The reason it's `<=` instead of just `==` is because there could be
whitespaces between
the two tokens. For example:

```python
#     last token end
#     | current token (newline) start
#     v v
def foo \n
#      ^
#      assume there's trailing whitespace here
```

Or, there could tokens that are considered "trivia" and thus aren't
emitted by the token
source. These are comments and non-logical newlines. For example:

```python
#     last token end
#     v
def foo # comment\n
#                ^ current token (newline) start
```

In either of the above cases, there's a "gap" between the end of the
last token and start
of the current token.

## Test Plan

Add test cases and update the snapshots.
2024-04-19 11:36:26 +00:00
Alex Waygood
9b80cc09ee Select fewer ruff rules when linting Python files in scripts/ (#11034) 2024-04-19 12:33:36 +01:00
Dhruv Manilawala
9bb23b0a38 Expect indented case block instead of match stmt (#11033)
## Summary

This PR adds a new `Clause::Case` and uses it to parse the body of a
`case` block. Earlier, it was using `Match` which would give an
incorrect error message like:

```
  |
1 | match subject:
2 |     case 1:
3 |     case 2: ...
  |     ^^^^ Syntax Error: Expected an indented block after `match` statement
  |
```

## Test Plan

Add test case and update the snapshot.
2024-04-19 16:46:15 +05:30
Charlie Marsh
06c248a126 [ruff] Ignore stub functions in unused-async (RUF029) (#11026)
## Summary

We should ignore methods that appear to be stubs, e.g.:

```python
async def foo() -> int: ...
```

Closes https://github.com/astral-sh/ruff/issues/11018.
2024-04-19 00:03:52 -04:00
Tibor Reiss
27902b7130 [pylint] Implement invalid-index-returned (PLE0305) (#10962)
Add pylint rule invalid-index-returned (PLE0305)

See https://github.com/astral-sh/ruff/issues/970 for rules

Test Plan: `cargo test`
2024-04-19 03:44:05 +00:00
Henry Asa
97acf1d59b ENH: Bump ruff dependency versions to support the latest release of v0.4.0 and Python 3.12 (#11025)
## Summary

With the release of
[`v0.4.0`](https://github.com/astral-sh/ruff/releases/tag/v0.4.0) of
`ruff`, I noticed that some of `ruff`'s dependencies were not updated to
their latest versions. The
[`ruff-pre-commit`](https://github.com/astral-sh/ruff-pre-commit)
package released
[`v0.4.0`](https://github.com/astral-sh/ruff-pre-commit/releases/tag/v0.4.0)
at the same time `ruff` was updated, but `ruff` still referenced
`v0.3.7` of the package, not the newly updated version. I updated the
`ruff-pre-commit` reference to be `v0.4.0`.

In a similar light, I noticed that the version of the
[`dill`](https://github.com/uqfoundation/dill) package being used was
not the latest version. I bumped `dill` from version `0.3.7` to `0.3.8`,
which now [fully supports Python
3.12](https://github.com/uqfoundation/dill/releases/tag/0.3.8).

## Related Issues

Resolves #11024
2024-04-19 03:37:54 +00:00
Tibor Reiss
adf63d9013 [pylint] Implement invalid-hash-returned (PLE0309) (#10961)
Add pylint rule invalid-hash-returned (PLE0309)

See https://github.com/astral-sh/ruff/issues/970 for rules

Test Plan: `cargo test`

TBD: from the description: "Strictly speaking `bool` is a subclass of
`int`, thus returning `True`/`False` is valid. To be consistent with
other rules (e.g.
[PLE0305](https://github.com/astral-sh/ruff/pull/10962)
invalid-index-returned), ruff will raise, compared to pylint which will
not raise."
2024-04-19 03:33:52 +00:00
MithicSpirit
5d3c9f2637 ruff server: fix Neovim setup guide command (#11021) 2024-04-19 08:24:09 +05:30
Charlie Marsh
33529c049e Allow NoReturn-like functions for __str__, __len__, etc. (#11017)
## Summary

If the method always raises, we shouldn't raise a diagnostic for
"returning a value of the wrong type".

Closes https://github.com/astral-sh/ruff/issues/11016.
2024-04-18 22:55:15 +00:00
Zanie Blue
e751b4ea82 Bump version to 0.4.0 (#11011)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-04-18 19:10:28 +00:00
Dhruv Manilawala
25a9131109 Add myself as codeowner for the parser (#11013) 2024-04-18 16:37:52 +00:00
Dhruv Manilawala
b7066e64e7 Consider binary expr for parenthesized with items parsing (#11012)
## Summary

This PR fixes the bug in with items parsing where it would fail to
recognize that the parenthesized expression is part of a large binary
expression.

## Test Plan

Add test cases and verified the snapshots.
2024-04-18 21:39:30 +05:30
Dhruv Manilawala
6c4d779140 Consider if expression for parenthesized with items parsing (#11010)
## Summary

This PR fixes the bug in parenthesized with items parsing where the `if`
expression would result into a syntax error.

The reason being that once we identify that the ambiguous left
parenthesis belongs to the context expression, the parser converts the
parsed with item into an equivalent expression. Then, the parser
continuous to parse any postfix expressions. Now, attribute, subscript,
and call are taken into account as they're grouped in
`parse_postfix_expression` but `if` expression has it's own parsing
function.

Use `parse_if_expression` once all postfix expressions have been parsed.
Ideally, I think that `if` could be included in postfix expression
parsing as they can be chained as well (`x if True else y if True else
z`).

## Test Plan

Add test cases and verified the snapshots.
2024-04-18 14:30:15 +00:00
Dhruv Manilawala
8020d486f6 Reset FOR_TARGET context for all kinds of parentheses (#11009)
## Summary

This PR fixes a bug in the new parser which involves the parser context
w.r.t. for statement. This is specifically around the `in` keyword which
can be present in the target expression and shouldn't be considered to
be part of the `for` statement header. Ideally it should use a context
which is passed between functions, thus using a call stack to set /
unset a specific variant which will be done in a follow-up PR as it
requires some amount of refactor.

## Test Plan

Add test cases and update the snapshots.
2024-04-18 19:37:50 +05:30
Dhruv Manilawala
13ffb5bc19 Replace LALRPOP parser with hand-written parser (#10036)
(Supersedes #9152, authored by @LaBatata101)

## Summary

This PR replaces the current parser generated from LALRPOP to a
hand-written recursive descent parser.

It also updates the grammar for [PEP
646](https://peps.python.org/pep-0646/) so that the parser outputs the
correct AST. For example, in `data[*x]`, the index expression is now a
tuple with a single starred expression instead of just a starred
expression.

Beyond the performance improvements, the parser is also error resilient
and can provide better error messages. The behavior as seen by any
downstream tools isn't changed. That is, the linter and formatter can
still assume that the parser will _stop_ at the first syntax error. This
will be updated in the following months.

For more details about the change here, refer to the PR corresponding to
the individual commits and the release blog post.

## Test Plan

Write _lots_ and _lots_ of tests for both valid and invalid syntax and
verify the output.

## Acknowledgements

- @MichaReiser for reviewing 100+ parser PRs and continuously providing
guidance throughout the project
- @LaBatata101 for initiating the transition to a hand-written parser in
#9152
- @addisoncrump for implementing the fuzzer which helped
[catch](https://github.com/astral-sh/ruff/pull/10903)
[a](https://github.com/astral-sh/ruff/pull/10910)
[lot](https://github.com/astral-sh/ruff/pull/10966)
[of](https://github.com/astral-sh/ruff/pull/10896)
[bugs](https://github.com/astral-sh/ruff/pull/10877)

---------

Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org>
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-04-18 17:57:39 +05:30
Alex Waygood
e09180b1df Rename SemanticModel::is_builtin to SemanticModel::has_builtin_binding (#10991) 2024-04-18 11:11:42 +01:00
Jane Lewis
2cc487eb22 ruff server: Introduce settings for directly configuring the linter and formatter (#10984)
## Summary

The following client settings have been introduced to the language
server:
* `lint.preview`
* `format.preview`
* `lint.select`
* `lint.extendSelect`
* `lint.ignore`
* `exclude`
* `lineLength`

`exclude` and `lineLength` apply to both the linter and formatter.

This does not actually use the settings yet, but makes them available
for future use.

## Test Plan

Snapshot tests have been updated.
2024-04-18 07:53:48 +00:00
Jane Lewis
5da7299b32 ruff server: Write a setup guide for Neovim (#10987)
## Summary

A setup guide has been written for NeoVim under a new
`crates/ruff_server/docs/setup` folder, where future setup guides will
also go. This setup guide was adapted from the [`ruff-lsp`
guide](https://github.com/astral-sh/ruff-lsp?tab=readme-ov-file#example-neovim).

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-04-18 02:46:30 +00:00
Charlie Marsh
4d8890eef5 [pylint] Omit stubs from invalid-bool and invalid-str-return-type (#11008)
## Summary

Reflecting some improvements that were made in
https://github.com/astral-sh/ruff/pull/10959.
2024-04-18 01:57:20 +00:00
Tibor Reiss
9f01ac3f87 [pylint] Implement invalid-length-returned (E0303) (#10963)
Add pylint rule invalid-length-returned (PLE0303)

See https://github.com/astral-sh/ruff/issues/970 for rules

Test Plan: `cargo test`

TBD: from the description: "Strictly speaking `bool` is a subclass of
`int`, thus returning `True`/`False` is valid. To be consistent with
other rules (e.g.
[PLE0305](https://github.com/astral-sh/ruff/pull/10962)
invalid-index-returned), ruff will raise, compared to pylint which will
not raise."
2024-04-18 01:54:52 +00:00
Charlie Marsh
b23414e3cc Resolve classes and functions relative to script name (#10965)
## Summary

If the user is analyzing a script (i.e., we have no module path), it
seems reasonable to use the script name when trying to identify paths to
objects defined _within_ the script.

Closes https://github.com/astral-sh/ruff/issues/10960.

## Test Plan

Ran:

```shell
check --isolated --select=B008 \
    --config 'lint.flake8-bugbear.extend-immutable-calls=["test.A"]' \
    test.py
```

On:

```python
class A: pass

def f(a=A()):
    pass
```
2024-04-18 01:42:50 +00:00
Tibor Reiss
1480d72643 [pylint] Implement invalid-bytes-returned (E0308) (#10959)
Add pylint rule invalid-bytes-returned (PLE0308)

See https://github.com/astral-sh/ruff/issues/970 for rules

Test Plan: `cargo test`
2024-04-18 01:38:14 +00:00
Charlie Marsh
06b3e376ac Improve documentation for block comment rules (#11007)
Closes https://github.com/astral-sh/ruff/issues/10632.
2024-04-18 01:22:19 +00:00
Charlie Marsh
e8b1125b30 [flake8-slots] Respect same-file Enum subclasses (#11006)
Closes https://github.com/astral-sh/ruff/issues/9890.
2024-04-17 21:15:52 -04:00
Zanie Blue
16cc9bd78d Improve display of rules in --show-settings (#11003)
Closes https://github.com/astral-sh/ruff/issues/11002
2024-04-17 20:18:41 +00:00
Jane Lewis
0a6327418d ruff server refreshes diagnostics for open files when file configuration is changed (#10988)
## Summary

The server now requests a [workspace diagnostic
refresh](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#diagnostic_refresh)
when a configuration file gets changed. This means that diagnostics for
all open files will be automatically re-requested by the client on a
config change.

## Test Plan

You can test this by opening several files in VS Code, setting `select`
in your file configuration to `[]`, and observing that the diagnostics
go away once the file is saved (besides any `Pylance` diagnostics).
Restore it to what it was before, and you should see the diagnostics
automatically return once a save happens.
2024-04-17 09:14:45 -07:00
Sigurd Spieckermann
518b29a9ef Add RUFF_OUTPUT_FILE environment variable support (#10992)
## Summary

I've added support for configuring the `ruff check` output file via the
environment variable `RUFF_OUTPUT_FILE` akin to #1731.

This is super useful when, e.g., generating a [GitLab code quality
report](https://docs.gitlab.com/ee/ci/testing/code_quality.html#implement-a-custom-tool)
while running Ruff as a pre-commit hook. Usually, `ruff check` should
print its human-readable output to `stdout`, but when run through
`pre-commit` _in a GitLab CI job_ it should write its output in `gitlab`
format to a file. So, to override these two settings only during CI,
environment variables come handy, and `RUFF_OUTPUT_FORMAT` already
exists but `RUFF_OUTPUT_FILE` has been missing.

A (simplified) GitLab CI job config for this scenario might look like
this:

```yaml
pre-commit:
  stage: test
  image: python
  variables:
    RUFF_OUTPUT_FILE: gl-code-quality-report.json
    RUFF_OUTPUT_FORMAT: gitlab
  before_script:
    - pip install pre-commit
  script:
    - pre-commit run --all-files --show-diff-on-failure
  artifacts:
    reports:
      codequality: gl-code-quality-report.json
```

## Test Plan

I tested it manually.
2024-04-17 11:42:45 -04:00
Alex Waygood
caae8d2c68 Optimise SemanticModel::match_builtin_expr (#11001) 2024-04-17 15:54:41 +01:00
Philipp Thiel
2971655b28 [flake8-bugbear] Treat raise NotImplemented-only bodies as stub functions (#10990)
## Summary

As discussed in
https://github.com/astral-sh/ruff/issues/10083#issuecomment-1969653610,
stubs detection now also covers the case where the function body raises
NotImplementedError and does nothing else.

## Test Plan

Tests for the relevant cases were added in B006_8.py
2024-04-17 14:06:40 +00:00
Alex Waygood
f48a794125 Change more usages of SemanticModel::is_builtin to use resolve_builtin_symbol or match_builtin_expr (#10982)
## Summary

This PR switches more callsites of `SemanticModel::is_builtin` to move
over to the new methods I introduced in #10919, which are more concise
and more accurate. I missed these calls in the first PR.
2024-04-17 07:50:10 +01:00
Jane Lewis
2882604451 ruff server: Important errors are now shown as popups (#10951)
## Summary

Fixes #10866.

Introduces the `show_err_msg!` macro which will send a message to be
shown as a popup to the client via the `window/showMessage` LSP method.

## Test Plan

Insert various `show_err_msg!` calls in common code paths (for example,
at the beginning of `event_loop`) and confirm that these messages appear
in your editor.

To test that panicking works correctly, add this to the top of the `fn
run` definition in
`crates/ruff_server/src/server/api/requests/execute_command.rs`:

```rust
panic!("This should appear");
```

Then, try running a command like `Ruff: Format document` from the
command palette (`Ctrl/Cmd+Shift+P`). You should see the following
messages appear:


![Screenshot 2024-04-16 at 11 20
57 AM](https://github.com/astral-sh/ruff/assets/19577865/ae430da6-82c3-4841-a419-664ff34034e8)
2024-04-16 18:32:53 +00:00
Jane Lewis
eab3c4e334 Enable ruff-specific source actions (#10916)
## Summary

Fixes #10780.

The server now send code actions to the client with a Ruff-specific
kind, `source.*.ruff`. The kind filtering logic has also been reworked
to support this.

## Test Plan

Add this to your `settings.json` in VS Code:

```json
{
  "[python]": {
    "editor.codeActionsOnSave": {
      "source.organizeImports.ruff": "explicit",
    },
  }
}
```

Imports should be automatically organized when you manually save with
`Ctrl/Cmd+S`.
2024-04-16 18:21:08 +00:00
Jane Lewis
cffc55576f ruff server: Resolve configuration for each document individually (#10950)
## Summary

Configuration is no longer the property of a workspace but rather of
individual documents. Just like the Ruff CLI, each document is
configured based on the 'nearest' project configuration. See [the Ruff
documentation](https://docs.astral.sh/ruff/configuration/#config-file-discovery)
for more details.

To reduce the amount of times we resolve configuration for a file, we
have an index for each workspace that stores a reference-counted pointer
to a configuration for a given folder. If another file in the same
folder is opened, the configuration is simply re-used rather than us
re-resolving it.

## Guide for reviewing

The first commit is just the restructuring work, which adds some noise
to the diff. If you want to quickly understand what's actually changed,
I recommend looking at the two commits that come after it.
f7c073d441 makes configuration a property
of `DocumentController`/`DocumentRef`, moving it out of `Workspace`, and
it also sets up the `ConfigurationIndex`, though it doesn't implement
its key function, `get_or_insert`. In the commit after it,
fc35618f17, we implement `get_or_insert`.

## Test Plan

The best way to test this would be to ensure that the behavior matches
the Ruff CLI. Open a project with multiple configuration files (or add
them yourself), and then introduce problems in certain files that won't
show due to their configuration. Add those same problems to a section of
the project where those rules are run. Confirm that the lint rules are
run as expected with `ruff check`. Then, open your editor and confirm
that the diagnostics shown match the CLI output.

As an example - I have a workspace with two separate folders, `pandas`
and `scipy`. I created a `pyproject.toml` file in `pandas/pandas/io` and
a `ruff.toml` file in `pandas/pandas/api`. I changed the `select` and
`preview` settings in the sub-folder configuration files and confirmed
that these were reflected in the diagnostics. I also confirmed that this
did not change the diagnostics for the `scipy` folder whatsoever.
2024-04-16 18:15:02 +00:00
Alex Waygood
4284e079b5 Improve inference capabilities of the BuiltinTypeChecker (#10976) 2024-04-16 18:53:22 +01:00
plredmond
65edbfe62f Detect unneeded async keywords on functions (#9966)
## Summary

This change adds a rule to detect functions declared `async` but lacking
any of `await`, `async with`, or `async for`. This resolves #9951.

## Test Plan

This change was tested by following
https://docs.astral.sh/ruff/contributing/#rule-testing-fixtures-and-snapshots
and adding positive and negative cases for each of `await` vs nothing,
`async with` vs `with`, and `async for` vs `for`.
2024-04-16 10:32:29 -07:00
Max Muoto
45db695c47 Fix Typo for extend-aliases Option (#10978) 2024-04-16 12:23:09 -04:00
Micha Reiser
1801798e85 Bump the size of RuleSet (#10972) 2024-04-16 14:20:46 +02:00
Micha Reiser
d4e140d47f perf: RuleTable::any_enabled (#10971) 2024-04-16 12:20:27 +00:00
Alex Waygood
f779babc5f Improve handling of builtin symbols in linter rules (#10919)
Add a new method to the semantic model to simplify and improve the correctness of a common pattern
2024-04-16 11:37:31 +01:00
Charlie Marsh
effd5188c9 [flake8-bandit] Allow urllib.request.urlopen calls with static Request argument (#10964)
## Summary

Allows, e.g.:

```python
import urllib

urllib.request.urlopen(urllib.request.Request("https://example.com/"))
```

...in
[`suspicious-url-open-usage`](https://docs.astral.sh/ruff/rules/suspicious-url-open-usage/).

See:
https://github.com/astral-sh/ruff/issues/7918#issuecomment-2057661054
2024-04-16 02:30:23 +00:00
renovate[bot]
6dccbd2b58 Update NPM Development dependencies to v7.7.0 (#10958) 2024-04-15 17:38:45 +00:00
Alex Waygood
0c8ba32819 Minor improvements to renovate config (#10957) 2024-04-15 17:33:06 +00:00
renovate[bot]
4ac523c19d fix(deps): update dependency react-resizable-panels to v2.0.17 (#10956)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-15 19:30:09 +02:00
StevenMia
f9214f95bb chore: remove repetitive words (#10952) 2024-04-15 12:57:48 +00:00
renovate[bot]
49d9ad4c7e Update Rust crate chrono to v0.4.38 (#10953)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-15 14:50:22 +02:00
Steve C
c2210359e7 [pylint] Implement self-cls-assignment (W0642) (#9267)
## Summary

This PR implements [`W0642`/`self-cls-assignment`](https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/self-cls-assignment.html)

See: #970 

## Test Plan

Add test cases and verified the updated snapshots.

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-04-15 09:06:01 +00:00
Hoël Bagard
670d66f54c [pycodestyle] Do not trigger E3 rules on defs following a function/method with a dummy body (#10704) 2024-04-15 10:23:49 +02:00
renovate[bot]
cbd500141f Update NPM Development dependencies (#10947)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-15 09:03:19 +02:00
Micha Reiser
b62aeb39d2 Configure Renovate to ignore ESLint 9 (#10946) 2024-04-15 06:52:43 +00:00
renovate[bot]
cdbd754870 Update Rust crate syn to v2.0.59 (#10941) 2024-04-15 06:38:18 +01:00
renovate[bot]
91efca1837 Update Rust crate codspeed-criterion-compat to v2.4.1 (#10931) 2024-04-14 21:49:31 -04:00
renovate[bot]
09ae2341e9 Update pre-commit dependencies (#10934) 2024-04-14 21:49:11 -04:00
renovate[bot]
f07af6fb63 Update Rust crate insta-cmd to 0.6.0 (#10937) 2024-04-14 21:48:46 -04:00
renovate[bot]
b11d17f65c Update Rust crate pep440_rs to 0.6.0 (#10938) 2024-04-14 21:48:37 -04:00
renovate[bot]
b4c7c55ddd Update Rust crate argfile to 0.2.0 (#10936) 2024-04-14 21:48:30 -04:00
renovate[bot]
0f01713257 Update Rust crate proc-macro2 to v1.0.80 (#10932) 2024-04-14 21:48:15 -04:00
renovate[bot]
ed9a92d915 Update Rust crate quote to v1.0.36 (#10933) 2024-04-14 21:48:09 -04:00
renovate[bot]
6da4ea6116 Update Rust crate anyhow to v1.0.82 (#10930) 2024-04-14 21:47:54 -04:00
Dhruv Manilawala
f9a828f493 Move Q003 to AST checker (#10923)
## Summary

This PR moves the `Q003` rule to AST checker.

This is the final rule that used the docstring detection state machine
and thus this PR removes it as well.

resolves: #7595 
resolves: #7808 

## Test Plan

- [x] `cargo test`
- [x] Make sure there are no changes in the ecosystem
2024-04-14 23:44:12 +05:30
Steve C
812b0976a9 [pylint] Support inverted comparisons (PLR1730) (#10920)
## Summary

Adds more aggressive logic to PLR1730, `if-stmt-min-max`

Closes #10907 

## Test Plan

`cargo test`

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-04-13 22:57:20 +00:00
Hoël Bagard
b356c4376c Fix S310 suspicious-url-open-usage description (#10917)
## Summary

The "What it does" section of the docstring is missing a verb, this PR
adds it.
2024-04-13 12:52:04 +01:00
Sebastian Pipping
85ca5b7eed Fix last example of flake8-bugbear rule B023 "function uses loop variable" (#10913)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Hi! 👋 

Thanks for sharing ruff as software libre — it helps me keep Python code
quality up with pre-commit, both locally and CI 🙏

While studying the examples at
https://docs.astral.sh/ruff/rules/function-uses-loop-variable/#example I
noticed that the last of the examples had a bug: prior to this fix, `ì`
was passed to the lambda for `x` rather than for `i` — the two are
mixed-up. The reason it's easy to overlook is because addition is an
commutative operation and so `x + i` and `i + x` give the same result
(and least with integers), despite the mix-up. For proof, let me demo
the relevant part with before and after:

```python
In [1]: from functools import partial

In [2]: [partial(lambda x, i: (x, i), i)(123) for i in range(3)]
Out[2]: [(0, 123), (1, 123), (2, 123)]

In [3]: [partial(lambda x, i: (x, i), i=i)(123) for i in range(3)]
Out[3]: [(123, 0), (123, 1), (123, 2)]
```

Does that make sense?

## Test Plan

<!-- How was it tested? -->
Was manually tested using IPython.


CC @r4f @grandchild
2024-04-12 20:07:52 +00:00
Charlie Marsh
c2421068bc Limit commutative non-augmented-assignments to primitive data types (#10912)
## Summary

I think this is the best we can do without type inference. At least it
will still catch some common cases.

Closes #10911.
2024-04-12 15:02:29 -04:00
Charlie Marsh
e9870fe468 Avoid non-augmented-assignment for reversed, non-commutative operators (#10909)
Closes https://github.com/astral-sh/ruff/issues/10900.
2024-04-12 10:04:57 -04:00
Charlie Marsh
a013050c11 Respect per-file-ignores for RUF100 on blanket # noqa (#10908)
## Summary

If `RUF100` was included in a per-file-ignore, we respected it on cases
like `# noqa: F401`, but not the blanket variant (`# noqa`).

Closes https://github.com/astral-sh/ruff/issues/10906.
2024-04-12 13:45:29 +00:00
Dhruv Manilawala
2e37cf6b3b Bump version to v0.3.7 (#10895) 2024-04-12 03:39:45 +00:00
wolfgangshi
a9e4393008 [pylint] Implement rule to prefer augmented assignment (PLR6104) (#9932)
## Summary

Implement new rule: Prefer augmented assignment (#8877). It checks for
the assignment statement with the form of `<expr> = <expr>
<binary-operator> …` with a unsafe fix to use augmented assignment
instead.

## Test Plan

1. Snapshot test is included in the PR.
2. Manually test with playground.
2024-04-11 23:08:42 -04:00
Charlie Marsh
312f43475f [pylint] Recode nan-comparison rule to W0177 (#10894)
## Summary

This was accidentally committed under `W0117`, but the actual Pylint
code is `W0177`:
https://pylint.readthedocs.io/en/latest/user_guide/checkers/features.html.

Closes https://github.com/astral-sh/ruff/issues/10791.
2024-04-11 22:49:20 -04:00
Carl Meyer
563daa8a86 Fix docs and add overlap test for negated per-file-ignores (#10863)
Refs #3172 

## Summary

Fix a typo in the docs example, and add a test for the case where a
negative pattern and a positive pattern overlap.

The behavior here is simple: patterns (positive or negative) are always
additive if they hit (i.e. match for a positive pattern, don't match for
a negated pattern). We never "un-ignore" previously-ignored rules based
on a pattern (positive or negative) failing to hit.

It's simple enough that I don't really see other cases we need to add
tests for (the tests we have cover all branches in the ignores_from_path
function that implements the core logic), but open to reviewer feedback.

I also didn't end up changing the docs to explain this more, because I
think they are accurate as written and don't wrongly imply any more
complex behavior. Open to reviewer feedback on this as well!

After some discussion, I think allowing negative patterns to un-ignore
rules is too confusing and easy to get wrong; if we need that, we should
add `per-file-selects` instead.

## Test Plan

Test/docs only change; tests pass, docs render and look right.

---------

Co-authored-by: Alex Waygood <Alex.Waygood@gmail.com>
2024-04-11 19:30:28 -06:00
Carl Meyer
7ae15c6e0a Fix comment copy/paste typo in newtype_index (#10892)
## Summary

This comment looks wrongly copy-pasted from the comment above, and
mentions the wrong type.

## Test Plan

Comment-only change.
2024-04-11 18:43:52 -06:00
Martin Imre
03899dcba3 [flake8-bugbear] Implement loop-iterator-mutation (B909) (#9578)
## Summary
This PR adds the implementation for the current
[flake8-bugbear](https://github.com/PyCQA/flake8-bugbear)'s B038 rule.
The B038 rule checks for mutation of loop iterators in the body of a for
loop and alerts when found.

Rational: 
Editing the loop iterator can lead to undesired behavior and is probably
a bug in most cases.

Closes #9511.

Note there will be a second iteration of B038 implemented in
`flake8-bugbear` soon, and this PR currently only implements the weakest
form of the rule.
I'd be happy to also implement the further improvements to B038 here in
ruff 🙂
See https://github.com/PyCQA/flake8-bugbear/issues/454 for more
information on the planned improvements.

## Test Plan
Re-using the same test file that I've used for `flake8-bugbear`, which
is included in this PR (look for the `B038.py` file).


Note: this is my first time using `rust` (beside `rustlings`) - I'd be
very happy about thorough feedback on what I could've done better
🙂 - Bring it on 😀
2024-04-11 19:52:52 +00:00
Carl Meyer
25f5a8b201 Struct not tuple for compiled per-file ignores (#10864)
## Summary

Code cleanup for per-file ignores; use a struct instead of a tuple.

Named the structs for individual ignores and the list of ignores
`CompiledPerFileIgnore` and `CompiledPerFileIgnoreList`. Name choice is
because we already have a `PerFileIgnore` struct for a
pre-compiled-matchers form of the config. Name bikeshedding welcome.

## Test Plan

Refactor, should not change behavior; existing tests pass.

---------

Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2024-04-11 13:47:57 -06:00
Charlie Marsh
e7d1d43f39 [pylint] Reverse min-max logic in if-stmt-min-max (#10890)
Closes https://github.com/astral-sh/ruff/issues/10889.
2024-04-11 14:16:13 -04:00
Charlie Marsh
9b9098c3dc Downgrade ESLint to v8 (#10888)
## Summary

Some of our plugins aren't compatible with v9.

Originally shipped in #10827.

## Test Plan

- `npm install`
- `npm ci`
2024-04-11 17:23:46 +00:00
Charlie Marsh
0cc154c2a9 Avoid TOCTOU errors in cache initialization (#10884)
## Summary

I believe this should close
https://github.com/astral-sh/ruff/issues/10880? The `.gitignore`
creation seems ok, since it truncates, but using `cachedir::is_tagged`
followed by `cachedir::add_tag` is not safe, as `cachedir::add_tag`
_fails_ if the file already exists.

This also matches the structure of the code in `uv`.

Closes https://github.com/astral-sh/ruff/issues/10880.
2024-04-11 12:09:07 -04:00
1322 changed files with 137977 additions and 108042 deletions

2
.gitattributes vendored
View File

@@ -8,5 +8,7 @@ crates/ruff_linter/resources/test/fixtures/pycodestyle/W391_3.py text eol=crlf
crates/ruff_python_formatter/resources/test/fixtures/ruff/docstring_code_examples_crlf.py text eol=crlf
crates/ruff_python_formatter/tests/snapshots/format@docstring_code_examples_crlf.py.snap text eol=crlf
crates/ruff_python_parser/resources/inline linguist-generated=true
ruff.schema.json linguist-generated=true text=auto eol=lf
*.md.snap linguist-language=Markdown

8
.github/CODEOWNERS vendored
View File

@@ -5,11 +5,13 @@
# - The '*' pattern is global owners.
# - Order is important. The last matching pattern has the most precedence.
# Jupyter
/crates/ruff_linter/src/jupyter/ @dhruvmanila
/crates/ruff_notebook/ @dhruvmanila
/crates/ruff_formatter/ @MichaReiser
/crates/ruff_python_formatter/ @MichaReiser
/crates/ruff_python_parser/ @MichaReiser
/crates/ruff_python_parser/ @MichaReiser @dhruvmanila
# flake8-pyi
/crates/ruff_linter/src/rules/flake8_pyi/ @AlexWaygood
# Script for fuzzing the parser
/scripts/fuzz-parser/ @AlexWaygood

View File

@@ -4,7 +4,8 @@
suppressNotifications: ["prEditedNotification"],
extends: ["config:recommended"],
labels: ["internal"],
schedule: ["on Monday"],
schedule: ["before 4am on Monday"],
semanticCommits: "disabled",
separateMajorMinor: false,
prHourlyLimit: 10,
enabledManagers: ["github-actions", "pre-commit", "cargo", "pep621", "npm"],
@@ -52,6 +53,13 @@
matchPackagePatterns: ["strum"],
description: "Weekly update of strum dependencies",
},
{
groupName: "ESLint",
matchManagers: ["npm"],
matchPackageNames: ["eslint"],
allowedVersions: "<9",
description: "Constraint ESLint to version 8 until TypeScript-eslint supports ESLint 9", // https://github.com/typescript-eslint/typescript-eslint/issues/8211
},
],
vulnerabilityAlerts: {
commitMessageSuffix: "",

View File

@@ -23,6 +23,8 @@ jobs:
name: "Determine changes"
runs-on: ubuntu-latest
outputs:
# Flag that is raised when any code that affects parser is changed
parser: ${{ steps.changed.outputs.parser_any_changed }}
# Flag that is raised when any code that affects linter is changed
linter: ${{ steps.changed.outputs.linter_any_changed }}
# Flag that is raised when any code that affects formatter is changed
@@ -39,6 +41,17 @@ jobs:
id: changed
with:
files_yaml: |
parser:
- Cargo.toml
- Cargo.lock
- crates/ruff_python_trivia/**
- crates/ruff_source_file/**
- crates/ruff_text_size/**
- crates/ruff_python_ast/**
- crates/ruff_python_parser/**
- scripts/fuzz-parser/**
- .github/workflows/ci.yaml
linter:
- Cargo.toml
- Cargo.lock
@@ -181,6 +194,22 @@ jobs:
cd crates/ruff_wasm
wasm-pack test --node
cargo-build-release:
name: "cargo build (release)"
runs-on: macos-latest
needs: determine_changes
if: ${{ needs.determine_changes.outputs.code == 'true' || github.ref == 'refs/heads/main' }}
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
- name: "Install Rust toolchain"
run: rustup show
- name: "Install mold"
uses: rui314/setup-mold@v1
- uses: Swatinem/rust-cache@v2
- name: "Build"
run: cargo build --release --locked
cargo-fuzz:
name: "cargo fuzz"
runs-on: ubuntu-latest
@@ -200,6 +229,38 @@ jobs:
tool: cargo-fuzz@0.11.2
- run: cargo fuzz build -s none
fuzz-parser:
name: "Fuzz the parser"
runs-on: ubuntu-latest
needs:
- cargo-test-linux
- determine_changes
if: ${{ needs.determine_changes.outputs.parser == 'true' }}
timeout-minutes: 20
env:
FORCE_COLOR: 1
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: Install uv
run: curl -LsSf https://astral.sh/uv/install.sh | sh
- name: Install Python requirements
run: uv pip install -r scripts/fuzz-parser/requirements.txt --system
- uses: actions/download-artifact@v4
name: Download Ruff binary to test
id: download-cached-binary
with:
name: ruff
path: ruff-to-test
- name: Fuzz
run: |
# Make executable, since artifact download doesn't preserve this
chmod +x ${{ steps.download-cached-binary.outputs.download-path }}/ruff
python scripts/fuzz-parser/fuzz.py 0-500 --test-executable ${{ steps.download-cached-binary.outputs.download-path }}/ruff
scripts:
name: "test scripts"
runs-on: ubuntu-latest
@@ -228,9 +289,7 @@ jobs:
- determine_changes
# Only runs on pull requests, since that is the only we way we can find the base version for comparison.
# Ecosystem check needs linter and/or formatter changes.
if: github.event_name == 'pull_request' && ${{
needs.determine_changes.outputs.code == 'true'
}}
if: ${{ github.event_name == 'pull_request' && needs.determine_changes.outputs.code == 'true' }}
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
@@ -509,7 +568,7 @@ jobs:
benchmarks:
runs-on: ubuntu-latest
needs: determine_changes
if: ${{ needs.determine_changes.outputs.code == 'true' || github.ref == 'refs/heads/main' }}
if: ${{ github.repository == 'astral-sh/ruff' && (needs.determine_changes.outputs.code == 'true' || github.ref == 'refs/heads/main') }}
timeout-minutes: 20
steps:
- name: "Checkout Branch"
@@ -525,23 +584,8 @@ jobs:
- uses: Swatinem/rust-cache@v2
# Codspeed comes with a very ancient cargo version (1.66) that resolves features flags differently than what we use now.
# This can result in build failures; see https://github.com/astral-sh/ruff/pull/10700.
# There's a pending codspeed PR to upgrade to a newer cargo version, but until that's merged, we need to use the workaround below.
# https://github.com/CodSpeedHQ/codspeed-rust/pull/31
# What we do is to call cargo build manually with the correct feature flags and RUSTC settings. We'll have to
# manually maintain the list of benchmarks to run with codspeed (the benefit is that we could detect which benchmarks to run and build based on the changes).
# This is inspired by https://github.com/oxc-project/oxc/blob/a0532adc654039a0c7ead7b35216dfa0b0cb8e8f/.github/workflows/benchmark.yml
- name: "Build benchmarks"
env:
RUSTFLAGS: "-C debuginfo=2 -C strip=none -g --cfg codspeed"
shell: bash
# Build all benchmarks, copy the binary to the codspeed directory, remove any `*.d` files that might have been created.
run: |
cargo build --release -p ruff_benchmark --bench parser --bench linter --bench formatter --bench lexer --features=codspeed
mkdir -p ./target/codspeed/ruff_benchmark
cp ./target/release/deps/{lexer,parser,linter,formatter}* target/codspeed/ruff_benchmark/
rm -rf ./target/codspeed/ruff_benchmark/*.d
run: cargo codspeed build --features codspeed -p ruff_benchmark
- name: "Run benchmarks"
uses: CodSpeedHQ/action@v2

72
.github/workflows/daily_fuzz.yaml vendored Normal file
View File

@@ -0,0 +1,72 @@
name: Daily parser fuzz
on:
workflow_dispatch:
schedule:
- cron: "0 0 * * *"
pull_request:
paths:
- ".github/workflows/daily_fuzz.yaml"
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
CARGO_INCREMENTAL: 0
CARGO_NET_RETRY: 10
CARGO_TERM_COLOR: always
RUSTUP_MAX_RETRIES: 10
PACKAGE_NAME: ruff
FORCE_COLOR: 1
jobs:
fuzz:
name: Fuzz
runs-on: ubuntu-latest
timeout-minutes: 20
# Don't run the cron job on forks:
if: ${{ github.repository == 'astral-sh/ruff' || github.event_name != 'schedule' }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: Install uv
run: curl -LsSf https://astral.sh/uv/install.sh | sh
- name: Install Python requirements
run: uv pip install -r scripts/fuzz-parser/requirements.txt --system
- name: "Install Rust toolchain"
run: rustup show
- name: "Install mold"
uses: rui314/setup-mold@v1
- uses: Swatinem/rust-cache@v2
- name: Build ruff
# A debug build means the script runs slower once it gets started,
# but this is outweighed by the fact that a release build takes *much* longer to compile in CI
run: cargo build --locked
- name: Fuzz
run: python scripts/fuzz-parser/fuzz.py $(shuf -i 0-9999999999999999999 -n 1000) --test-executable target/debug/ruff
create-issue-on-failure:
name: Create an issue if the daily fuzz surfaced any bugs
runs-on: ubuntu-latest
needs: fuzz
if: ${{ github.repository == 'astral-sh/ruff' && always() && github.event_name == 'schedule' && needs.fuzz.result == 'failure' }}
permissions:
issues: write
steps:
- uses: actions/github-script@v7
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
await github.rest.issues.create({
owner: "astral-sh",
repo: "ruff",
title: `Daily parser fuzz failed on ${new Date().toDateString()}`,
body: "Runs listed here: https://github.com/astral-sh/ruff/actions/workflows/daily_fuzz.yml",
labels: ["bug", "parser", "fuzzer"],
})

View File

@@ -58,7 +58,7 @@ jobs:
path: dist
macos-x86_64:
runs-on: macos-latest
runs-on: macos-12
steps:
- uses: actions/checkout@v4
with:
@@ -97,8 +97,8 @@ jobs:
*.tar.gz
*.sha256
macos-universal:
runs-on: macos-latest
macos-aarch64:
runs-on: macos-14
steps:
- uses: actions/checkout@v4
with:
@@ -106,16 +106,17 @@ jobs:
- uses: actions/setup-python@v5
with:
python-version: ${{ env.PYTHON_VERSION }}
architecture: x64
architecture: arm64
- name: "Prep README.md"
run: python scripts/transform_readme.py --target pypi
- name: "Build wheels - universal2"
- name: "Build wheels - aarch64"
uses: PyO3/maturin-action@v1
with:
args: --release --locked --target universal2-apple-darwin --out dist
- name: "Test wheel - universal2"
target: aarch64
args: --release --locked --out dist
- name: "Test wheel - aarch64"
run: |
pip install dist/${{ env.PACKAGE_NAME }}-*universal2.whl --force-reinstall
pip install dist/${{ env.PACKAGE_NAME }}-*.whl --force-reinstall
ruff --help
python -m ruff --help
- name: "Upload wheels"
@@ -451,7 +452,7 @@ jobs:
name: Upload to PyPI
runs-on: ubuntu-latest
needs:
- macos-universal
- macos-aarch64
- macos-x86_64
- windows
- linux

View File

@@ -41,7 +41,7 @@ repos:
)$
- repo: https://github.com/crate-ci/typos
rev: v1.20.4
rev: v1.20.10
hooks:
- id: typos
@@ -55,7 +55,7 @@ repos:
pass_filenames: false # This makes it a lot faster
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.3.5
rev: v0.4.2
hooks:
- id: ruff-format
- id: ruff

View File

@@ -1,5 +1,142 @@
# Changelog
## 0.4.2
### Rule changes
- \[`flake8-pyi`\] Allow for overloaded `__exit__` and `__aexit__` definitions (`PYI036`) ([#11057](https://github.com/astral-sh/ruff/pull/11057))
- \[`pyupgrade`\] Catch usages of `"%s" % var` and provide an unsafe fix (`UP031`) ([#11019](https://github.com/astral-sh/ruff/pull/11019))
- \[`refurb`\] Implement new rule that suggests min/max over `sorted()` (`FURB192`) ([#10868](https://github.com/astral-sh/ruff/pull/10868))
### Server
- Fix an issue with missing diagnostics for Neovim and Helix ([#11092](https://github.com/astral-sh/ruff/pull/11092))
- Implement hover documentation for `noqa` codes ([#11096](https://github.com/astral-sh/ruff/pull/11096))
- Introduce common Ruff configuration options with new server settings ([#11062](https://github.com/astral-sh/ruff/pull/11062))
### Bug fixes
- Use `macos-12` for building release wheels to enable macOS 11 compatibility ([#11146](https://github.com/astral-sh/ruff/pull/11146))
- \[`flake8-blind-expect`\] Allow raise from in `BLE001` ([#11131](https://github.com/astral-sh/ruff/pull/11131))
- \[`flake8-pyi`\] Allow simple assignments to `None` in enum class scopes (`PYI026`) ([#11128](https://github.com/astral-sh/ruff/pull/11128))
- \[`flake8-simplify`\] Avoid raising `SIM911` for non-`zip` attribute calls ([#11126](https://github.com/astral-sh/ruff/pull/11126))
- \[`refurb`\] Avoid `operator.itemgetter` suggestion for single-item tuple ([#11095](https://github.com/astral-sh/ruff/pull/11095))
- \[`ruff`\] Respect per-file-ignores for `RUF100` with no other diagnostics ([#11058](https://github.com/astral-sh/ruff/pull/11058))
- \[`ruff`\] Fix async comprehension false positive (`RUF029`) ([#11070](https://github.com/astral-sh/ruff/pull/11070))
### Documentation
- \[`flake8-bugbear`\] Document explicitly disabling strict zip (`B905`) ([#11040](https://github.com/astral-sh/ruff/pull/11040))
- \[`flake8-type-checking`\] Mention `lint.typing-modules` in `TCH001`, `TCH002`, and `TCH003` ([#11144](https://github.com/astral-sh/ruff/pull/11144))
- \[`isort`\] Improve documentation around custom `isort` sections ([#11050](https://github.com/astral-sh/ruff/pull/11050))
- \[`pylint`\] Fix documentation oversight for `invalid-X-returns` ([#11094](https://github.com/astral-sh/ruff/pull/11094))
### Performance
- Use `matchit` to resolve per-file settings ([#11111](https://github.com/astral-sh/ruff/pull/11111))
## 0.4.1
### Preview features
- \[`pylint`\] Implement `invalid-hash-returned` (`PLE0309`) ([#10961](https://github.com/astral-sh/ruff/pull/10961))
- \[`pylint`\] Implement `invalid-index-returned` (`PLE0305`) ([#10962](https://github.com/astral-sh/ruff/pull/10962))
### Bug fixes
- \[`pylint`\] Allow `NoReturn`-like functions for `__str__`, `__len__`, etc. (`PLE0307`) ([#11017](https://github.com/astral-sh/ruff/pull/11017))
- Parser: Use empty range when there's "gap" in token source ([#11032](https://github.com/astral-sh/ruff/pull/11032))
- \[`ruff`\] Ignore stub functions in `unused-async` (`RUF029`) ([#11026](https://github.com/astral-sh/ruff/pull/11026))
- Parser: Expect indented case block instead of match stmt ([#11033](https://github.com/astral-sh/ruff/pull/11033))
## 0.4.0
### A new, hand-written parser
Ruff's new parser is **>2x faster**, which translates to a **20-40% speedup** for all linting and formatting invocations.
There's a lot to say about this exciting change, so check out the [blog post](https://astral.sh/blog/ruff-v0.4.0) for more details!
See [#10036](https://github.com/astral-sh/ruff/pull/10036) for implementation details.
### A new language server in Rust
With this release, we also want to highlight our new language server. `ruff server` is a Rust-powered language
server that comes built-in with Ruff. It can be used with any editor that supports the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/) (LSP).
It uses a multi-threaded, lock-free architecture inspired by `rust-analyzer` and it will open the door for a lot
of exciting features. Its also faster than our previous [Python-based language server](https://github.com/astral-sh/ruff-lsp)
-- but you probably guessed that already.
`ruff server` is only in alpha, but it has a lot of features that you can try out today:
- Lints Python files automatically and shows quick-fixes when available
- Formats Python files, with support for range formatting
- Comes with commands for quickly performing actions: `ruff.applyAutofix`, `ruff.applyFormat`, and `ruff.applyOrganizeImports`
- Supports `source.fixAll` and `source.organizeImports` source actions
- Automatically reloads your project configuration when you change it
To setup `ruff server` with your editor, refer to the [README.md](https://github.com/astral-sh/ruff/blob/main/crates/ruff_server/README.md).
### Preview features
- \[`pycodestyle`\] Do not trigger `E3` rules on `def`s following a function/method with a dummy body ([#10704](https://github.com/astral-sh/ruff/pull/10704))
- \[`pylint`\] Implement `invalid-bytes-returned` (`E0308`) ([#10959](https://github.com/astral-sh/ruff/pull/10959))
- \[`pylint`\] Implement `invalid-length-returned` (`E0303`) ([#10963](https://github.com/astral-sh/ruff/pull/10963))
- \[`pylint`\] Implement `self-cls-assignment` (`W0642`) ([#9267](https://github.com/astral-sh/ruff/pull/9267))
- \[`pylint`\] Omit stubs from `invalid-bool` and `invalid-str-return-type` ([#11008](https://github.com/astral-sh/ruff/pull/11008))
- \[`ruff`\] New rule `unused-async` (`RUF029`) to detect unneeded `async` keywords on functions ([#9966](https://github.com/astral-sh/ruff/pull/9966))
### Rule changes
- \[`flake8-bandit`\] Allow `urllib.request.urlopen` calls with static `Request` argument (`S310`) ([#10964](https://github.com/astral-sh/ruff/pull/10964))
- \[`flake8-bugbear`\] Treat `raise NotImplemented`-only bodies as stub functions (`B006`) ([#10990](https://github.com/astral-sh/ruff/pull/10990))
- \[`flake8-slots`\] Respect same-file `Enum` subclasses (`SLOT000`) ([#11006](https://github.com/astral-sh/ruff/pull/11006))
- \[`pylint`\] Support inverted comparisons (`PLR1730`) ([#10920](https://github.com/astral-sh/ruff/pull/10920))
### Linter
- Improve handling of builtin symbols in linter rules ([#10919](https://github.com/astral-sh/ruff/pull/10919))
- Improve display of rules in `--show-settings` ([#11003](https://github.com/astral-sh/ruff/pull/11003))
- Improve inference capabilities of the `BuiltinTypeChecker` ([#10976](https://github.com/astral-sh/ruff/pull/10976))
- Resolve classes and functions relative to script name ([#10965](https://github.com/astral-sh/ruff/pull/10965))
- Improve performance of `RuleTable::any_enabled` ([#10971](https://github.com/astral-sh/ruff/pull/10971))
### Server
*This section is devoted to updates for our new language server, written in Rust.*
- Enable ruff-specific source actions ([#10916](https://github.com/astral-sh/ruff/pull/10916))
- Refreshes diagnostics for open files when file configuration is changed ([#10988](https://github.com/astral-sh/ruff/pull/10988))
- Important errors are now shown as popups ([#10951](https://github.com/astral-sh/ruff/pull/10951))
- Introduce settings for directly configuring the linter and formatter ([#10984](https://github.com/astral-sh/ruff/pull/10984))
- Resolve configuration for each document individually ([#10950](https://github.com/astral-sh/ruff/pull/10950))
- Write a setup guide for Neovim ([#10987](https://github.com/astral-sh/ruff/pull/10987))
### Configuration
- Add `RUFF_OUTPUT_FILE` environment variable support ([#10992](https://github.com/astral-sh/ruff/pull/10992))
### Bug fixes
- Avoid `non-augmented-assignment` for reversed, non-commutative operators (`PLR6104`) ([#10909](https://github.com/astral-sh/ruff/pull/10909))
- Limit commutative non-augmented-assignments to primitive data types (`PLR6104`) ([#10912](https://github.com/astral-sh/ruff/pull/10912))
- Respect `per-file-ignores` for `RUF100` on blanket `# noqa` ([#10908](https://github.com/astral-sh/ruff/pull/10908))
- Consider `if` expression for parenthesized with items parsing ([#11010](https://github.com/astral-sh/ruff/pull/11010))
- Consider binary expr for parenthesized with items parsing ([#11012](https://github.com/astral-sh/ruff/pull/11012))
- Reset `FOR_TARGET` context for all kinds of parentheses ([#11009](https://github.com/astral-sh/ruff/pull/11009))
## 0.3.7
### Preview features
- \[`flake8-bugbear`\] Implement `loop-iterator-mutation` (`B909`) ([#9578](https://github.com/astral-sh/ruff/pull/9578))
- \[`pylint`\] Implement rule to prefer augmented assignment (`PLR6104`) ([#9932](https://github.com/astral-sh/ruff/pull/9932))
### Bug fixes
- Avoid TOCTOU errors in cache initialization ([#10884](https://github.com/astral-sh/ruff/pull/10884))
- \[`pylint`\] Recode `nan-comparison` rule to `W0177` ([#10894](https://github.com/astral-sh/ruff/pull/10894))
- \[`pylint`\] Reverse min-max logic in `if-stmt-min-max` ([#10890](https://github.com/astral-sh/ruff/pull/10890))
## 0.3.6
### Preview features
@@ -1372,7 +1509,7 @@ Read Ruff's new [versioning policy](https://docs.astral.sh/ruff/versioning/).
- \[`refurb`\] Add `single-item-membership-test` (`FURB171`) ([#7815](https://github.com/astral-sh/ruff/pull/7815))
- \[`pylint`\] Add `and-or-ternary` (`R1706`) ([#7811](https://github.com/astral-sh/ruff/pull/7811))
_New rules are added in [preview](https://docs.astral.sh/ruff/preview/)._
*New rules are added in [preview](https://docs.astral.sh/ruff/preview/).*
### Configuration

684
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -15,7 +15,7 @@ license = "MIT"
aho-corasick = { version = "1.1.3" }
annotate-snippets = { version = "0.9.2", features = ["color"] }
anyhow = { version = "1.0.80" }
argfile = { version = "0.1.6" }
argfile = { version = "0.2.0" }
bincode = { version = "1.3.3" }
bitflags = { version = "2.5.0" }
bstr = { version = "1.9.1" }
@@ -24,13 +24,14 @@ chrono = { version = "0.4.35", default-features = false, features = ["clock"] }
clap = { version = "4.5.3", features = ["derive"] }
clap_complete_command = { version = "0.5.1" }
clearscreen = { version = "3.0.0" }
codspeed-criterion-compat = { version = "2.4.0", default-features = false }
codspeed-criterion-compat = { version = "2.6.0", default-features = false }
colored = { version = "2.1.0" }
console_error_panic_hook = { version = "0.1.7" }
console_log = { version = "1.0.0" }
countme = { version = "3.0.1" }
criterion = { version = "0.5.1", default-features = false }
crossbeam = { version = "0.8.4" }
crossbeam-channel = { version = "0.5.12" }
dashmap = { version = "5.5.3" }
dirs = { version = "5.0.0" }
drop_bomb = { version = "0.1.5" }
env_logger = { version = "0.11.0" }
@@ -39,26 +40,28 @@ filetime = { version = "0.2.23" }
fs-err = { version = "2.11.0" }
glob = { version = "0.3.1" }
globset = { version = "0.4.14" }
hashbrown = "0.14.3"
hexf-parse = { version = "0.2.1" }
ignore = { version = "0.4.22" }
imara-diff = { version = "0.1.5" }
imperative = { version = "1.0.4" }
indexmap = { version = "2.2.6" }
indicatif = { version = "0.17.8" }
indoc = { version = "2.0.4" }
insta = { version = "1.35.1", feature = ["filters", "glob"] }
insta-cmd = { version = "0.5.0" }
insta-cmd = { version = "0.6.0" }
is-macro = { version = "0.3.5" }
is-wsl = { version = "0.4.0" }
itertools = { version = "0.12.1" }
js-sys = { version = "0.3.69" }
jod-thread = { version = "0.1.2" }
lalrpop-util = { version = "0.20.0", default-features = false }
lexical-parse-float = { version = "0.8.0", features = ["format"] }
libc = { version = "0.2.153" }
libcst = { version = "1.1.0", default-features = false }
log = { version = "0.4.17" }
lsp-server = { version = "0.7.6" }
lsp-types = { version = "0.95.0", features = ["proposed"] }
matchit = { version = "0.8.1" }
memchr = { version = "2.7.1" }
mimalloc = { version = "0.1.39" }
natord = { version = "1.0.9" }
@@ -66,12 +69,14 @@ notify = { version = "6.1.1" }
num_cpus = { version = "1.16.0" }
once_cell = { version = "1.19.0" }
path-absolutize = { version = "3.1.1" }
path-slash = { version = "0.2.1" }
pathdiff = { version = "0.2.1" }
pep440_rs = { version = "0.5.0", features = ["serde"] }
parking_lot = "0.12.1"
pep440_rs = { version = "0.6.0", features = ["serde"] }
pretty_assertions = "1.3.0"
proc-macro2 = { version = "1.0.79" }
pyproject-toml = { version = "0.9.0" }
quick-junit = { version = "0.3.5" }
quick-junit = { version = "0.4.0" }
quote = { version = "1.0.23" }
rand = { version = "0.8.5" }
rayon = { version = "1.10.0" }

View File

@@ -4,7 +4,7 @@
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![image](https://img.shields.io/pypi/v/ruff.svg)](https://pypi.python.org/pypi/ruff)
[![image](https://img.shields.io/pypi/l/ruff.svg)](https://pypi.python.org/pypi/ruff)
[![image](https://img.shields.io/pypi/l/ruff.svg)](https://github.com/astral-sh/ruff/blob/main/LICENSE)
[![image](https://img.shields.io/pypi/pyversions/ruff.svg)](https://pypi.python.org/pypi/ruff)
[![Actions status](https://github.com/astral-sh/ruff/workflows/CI/badge.svg)](https://github.com/astral-sh/ruff/actions)
[![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?logo=discord&logoColor=white)](https://discord.com/invite/astral-sh)
@@ -50,6 +50,7 @@ times faster than any individual tool.
Ruff is extremely actively developed and used in major open-source projects like:
- [Apache Airflow](https://github.com/apache/airflow)
- [Apache Superset](https://github.com/apache/superset)
- [FastAPI](https://github.com/tiangolo/fastapi)
- [Hugging Face](https://github.com/huggingface/transformers)
- [Pandas](https://github.com/pandas-dev/pandas)
@@ -151,7 +152,7 @@ Ruff can also be used as a [pre-commit](https://pre-commit.com/) hook via [`ruff
```yaml
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.3.6
rev: v0.4.2
hooks:
# Run the linter.
- id: ruff
@@ -498,7 +499,7 @@ If you're using Ruff, consider adding the Ruff badge to your project's `README.m
## License
MIT
This repository is licensed under the [MIT License](https://github.com/astral-sh/ruff/blob/main/LICENSE)
<div align="center">
<a target="_blank" href="https://astral.sh" style="background:none">

View File

@@ -11,3 +11,9 @@ ned = "ned"
pn = "pn" # `import panel as pd` is a thing
poit = "poit"
BA = "BA" # acronym for "Bad Allowed", used in testing.
[default]
extend-ignore-re = [
# Line ignore with trailing "spellchecker:disable-line"
"(?Rm)^.*#\\s*spellchecker:disable-line$"
]

View File

@@ -0,0 +1,45 @@
[package]
name = "red_knot"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
homepage.workspace = true
documentation.workspace = true
repository.workspace = true
authors.workspace = true
license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
ruff_python_parser = { path = "../ruff_python_parser" }
ruff_python_ast = { path = "../ruff_python_ast" }
ruff_python_trivia = { path = "../ruff_python_trivia" }
ruff_text_size = { path = "../ruff_text_size" }
ruff_index = { path = "../ruff_index" }
ruff_notebook = { path = "../ruff_notebook" }
anyhow = { workspace = true }
bitflags = { workspace = true }
ctrlc = "3.4.4"
crossbeam-channel = { workspace = true }
dashmap = { workspace = true }
hashbrown = { workspace = true }
indexmap = { workspace = true }
log = { workspace = true }
notify = { workspace = true }
parking_lot = { workspace = true }
rayon = { workspace = true }
rustc-hash = { workspace = true }
smallvec = { workspace = true }
smol_str = "0.2.1"
tracing = { workspace = true }
tracing-subscriber = { workspace = true }
tracing-tree = { workspace = true }
[dev-dependencies]
textwrap = "0.16.1"
tempfile = { workspace = true }
[lints]
workspace = true

View File

@@ -0,0 +1,415 @@
use std::any::type_name;
use std::fmt::{Debug, Formatter};
use std::hash::{Hash, Hasher};
use std::marker::PhantomData;
use rustc_hash::FxHashMap;
use ruff_index::{Idx, IndexVec};
use ruff_python_ast::visitor::preorder;
use ruff_python_ast::visitor::preorder::{PreorderVisitor, TraversalSignal};
use ruff_python_ast::{
AnyNodeRef, AstNode, ExceptHandler, ExceptHandlerExceptHandler, Expr, MatchCase, ModModule,
NodeKind, Parameter, Stmt, StmtAnnAssign, StmtAssign, StmtAugAssign, StmtClassDef,
StmtFunctionDef, StmtGlobal, StmtImport, StmtImportFrom, StmtNonlocal, StmtTypeAlias,
TypeParam, TypeParamParamSpec, TypeParamTypeVar, TypeParamTypeVarTuple, WithItem,
};
use ruff_text_size::{Ranged, TextRange};
/// A type agnostic ID that uniquely identifies an AST node in a file.
#[ruff_index::newtype_index]
pub struct AstId;
/// A typed ID that uniquely identifies an AST node in a file.
///
/// This is different from [`AstId`] in that it is a combination of ID and the type of the node the ID identifies.
/// Typing the ID prevents mixing IDs of different node types and allows to restrict the API to only accept
/// nodes for which an ID has been created (not all AST nodes get an ID).
pub struct TypedAstId<N: HasAstId> {
erased: AstId,
_marker: PhantomData<fn() -> N>,
}
impl<N: HasAstId> TypedAstId<N> {
/// Upcasts this ID from a more specific node type to a more general node type.
pub fn upcast<M: HasAstId>(self) -> TypedAstId<M>
where
N: Into<M>,
{
TypedAstId {
erased: self.erased,
_marker: PhantomData,
}
}
}
impl<N: HasAstId> Copy for TypedAstId<N> {}
impl<N: HasAstId> Clone for TypedAstId<N> {
fn clone(&self) -> Self {
*self
}
}
impl<N: HasAstId> PartialEq for TypedAstId<N> {
fn eq(&self, other: &Self) -> bool {
self.erased == other.erased
}
}
impl<N: HasAstId> Eq for TypedAstId<N> {}
impl<N: HasAstId> Hash for TypedAstId<N> {
fn hash<H: Hasher>(&self, state: &mut H) {
self.erased.hash(state);
}
}
impl<N: HasAstId> Debug for TypedAstId<N> {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_tuple("TypedAstId")
.field(&self.erased)
.field(&type_name::<N>())
.finish()
}
}
pub struct AstIds {
ids: IndexVec<AstId, NodeKey>,
reverse: FxHashMap<NodeKey, AstId>,
}
impl AstIds {
// TODO rust analyzer doesn't allocate an ID for every node. It only allocates ids for
// nodes with a corresponding HIR element, that is nodes that are definitions.
pub fn from_module(module: &ModModule) -> Self {
let mut visitor = AstIdsVisitor::default();
// TODO: visit_module?
// Make sure we visit the root
visitor.create_id(module);
visitor.visit_body(&module.body);
while let Some(deferred) = visitor.deferred.pop() {
match deferred {
DeferredNode::FunctionDefinition(def) => {
def.visit_preorder(&mut visitor);
}
DeferredNode::ClassDefinition(def) => def.visit_preorder(&mut visitor),
}
}
AstIds {
ids: visitor.ids,
reverse: visitor.reverse,
}
}
/// Returns the ID to the root node.
pub fn root(&self) -> NodeKey {
self.ids[AstId::new(0)]
}
/// Returns the [`TypedAstId`] for a node.
pub fn ast_id<N: HasAstId>(&self, node: &N) -> TypedAstId<N> {
let key = node.syntax_node_key();
TypedAstId {
erased: self.reverse.get(&key).copied().unwrap(),
_marker: PhantomData,
}
}
/// Returns the [`TypedAstId`] for the node identified with the given [`TypedNodeKey`].
pub fn ast_id_for_key<N: HasAstId>(&self, node: &TypedNodeKey<N>) -> TypedAstId<N> {
let ast_id = self.ast_id_for_node_key(node.inner);
TypedAstId {
erased: ast_id,
_marker: PhantomData,
}
}
/// Returns the untyped [`AstId`] for the node identified by the given `node` key.
pub fn ast_id_for_node_key(&self, node: NodeKey) -> AstId {
self.reverse
.get(&node)
.copied()
.expect("Can't find node in AstIds map.")
}
/// Returns the [`TypedNodeKey`] for the node identified by the given [`TypedAstId`].
pub fn key<N: HasAstId>(&self, id: TypedAstId<N>) -> TypedNodeKey<N> {
let syntax_key = self.ids[id.erased];
TypedNodeKey::new(syntax_key).unwrap()
}
pub fn node_key<H: HasAstId>(&self, id: TypedAstId<H>) -> NodeKey {
self.ids[id.erased]
}
}
impl std::fmt::Debug for AstIds {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
let mut map = f.debug_map();
for (key, value) in self.ids.iter_enumerated() {
map.entry(&key, &value);
}
map.finish()
}
}
impl PartialEq for AstIds {
fn eq(&self, other: &Self) -> bool {
self.ids == other.ids
}
}
impl Eq for AstIds {}
#[derive(Default)]
struct AstIdsVisitor<'a> {
ids: IndexVec<AstId, NodeKey>,
reverse: FxHashMap<NodeKey, AstId>,
deferred: Vec<DeferredNode<'a>>,
}
impl<'a> AstIdsVisitor<'a> {
fn create_id<A: HasAstId>(&mut self, node: &A) {
let node_key = node.syntax_node_key();
let id = self.ids.push(node_key);
self.reverse.insert(node_key, id);
}
}
impl<'a> PreorderVisitor<'a> for AstIdsVisitor<'a> {
fn visit_stmt(&mut self, stmt: &'a Stmt) {
match stmt {
Stmt::FunctionDef(def) => {
self.create_id(def);
self.deferred.push(DeferredNode::FunctionDefinition(def));
return;
}
// TODO defer visiting the assignment body, type alias parameters etc?
Stmt::ClassDef(def) => {
self.create_id(def);
self.deferred.push(DeferredNode::ClassDefinition(def));
return;
}
Stmt::Expr(_) => {
// Skip
return;
}
Stmt::Return(_) => {}
Stmt::Delete(_) => {}
Stmt::Assign(assignment) => self.create_id(assignment),
Stmt::AugAssign(assignment) => {
self.create_id(assignment);
}
Stmt::AnnAssign(assignment) => self.create_id(assignment),
Stmt::TypeAlias(assignment) => self.create_id(assignment),
Stmt::For(_) => {}
Stmt::While(_) => {}
Stmt::If(_) => {}
Stmt::With(_) => {}
Stmt::Match(_) => {}
Stmt::Raise(_) => {}
Stmt::Try(_) => {}
Stmt::Assert(_) => {}
Stmt::Import(import) => self.create_id(import),
Stmt::ImportFrom(import_from) => self.create_id(import_from),
Stmt::Global(global) => self.create_id(global),
Stmt::Nonlocal(non_local) => self.create_id(non_local),
Stmt::Pass(_) => {}
Stmt::Break(_) => {}
Stmt::Continue(_) => {}
Stmt::IpyEscapeCommand(_) => {}
}
preorder::walk_stmt(self, stmt);
}
fn visit_expr(&mut self, _expr: &'a Expr) {}
fn visit_parameter(&mut self, parameter: &'a Parameter) {
self.create_id(parameter);
preorder::walk_parameter(self, parameter);
}
fn visit_except_handler(&mut self, except_handler: &'a ExceptHandler) {
match except_handler {
ExceptHandler::ExceptHandler(except_handler) => {
self.create_id(except_handler);
}
}
preorder::walk_except_handler(self, except_handler);
}
fn visit_with_item(&mut self, with_item: &'a WithItem) {
self.create_id(with_item);
preorder::walk_with_item(self, with_item);
}
fn visit_match_case(&mut self, match_case: &'a MatchCase) {
self.create_id(match_case);
preorder::walk_match_case(self, match_case);
}
fn visit_type_param(&mut self, type_param: &'a TypeParam) {
self.create_id(type_param);
}
}
enum DeferredNode<'a> {
FunctionDefinition(&'a StmtFunctionDef),
ClassDefinition(&'a StmtClassDef),
}
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct TypedNodeKey<N: AstNode> {
/// The type erased node key.
inner: NodeKey,
_marker: PhantomData<fn() -> N>,
}
impl<N: AstNode> TypedNodeKey<N> {
pub fn from_node(node: &N) -> Self {
let inner = NodeKey {
kind: node.as_any_node_ref().kind(),
range: node.range(),
};
Self {
inner,
_marker: PhantomData,
}
}
pub fn new(node_key: NodeKey) -> Option<Self> {
N::can_cast(node_key.kind).then_some(TypedNodeKey {
inner: node_key,
_marker: PhantomData,
})
}
pub fn resolve<'a>(&self, root: AnyNodeRef<'a>) -> Option<N::Ref<'a>> {
let node_ref = self.inner.resolve(root)?;
Some(N::cast_ref(node_ref).unwrap())
}
pub fn resolve_unwrap<'a>(&self, root: AnyNodeRef<'a>) -> N::Ref<'a> {
self.resolve(root).expect("node should resolve")
}
pub fn erased(&self) -> &NodeKey {
&self.inner
}
}
struct FindNodeKeyVisitor<'a> {
key: NodeKey,
result: Option<AnyNodeRef<'a>>,
}
impl<'a> PreorderVisitor<'a> for FindNodeKeyVisitor<'a> {
fn enter_node(&mut self, node: AnyNodeRef<'a>) -> TraversalSignal {
if self.result.is_some() {
return TraversalSignal::Skip;
}
if node.range() == self.key.range && node.kind() == self.key.kind {
self.result = Some(node);
TraversalSignal::Skip
} else if node.range().contains_range(self.key.range) {
TraversalSignal::Traverse
} else {
TraversalSignal::Skip
}
}
fn visit_body(&mut self, body: &'a [Stmt]) {
// TODO it would be more efficient to use binary search instead of linear
for stmt in body {
if stmt.range().start() > self.key.range.end() {
break;
}
self.visit_stmt(stmt);
}
}
}
// TODO an alternative to this is to have a `NodeId` on each node (in increasing order depending on the position).
// This would allow to reduce the size of this to a u32.
// What would be nice if we could use an `Arc::weak_ref` here but that only works if we use
// `Arc` internally
// TODO: Implement the logic to resolve a node, given a db (and the correct file).
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct NodeKey {
kind: NodeKind,
range: TextRange,
}
impl NodeKey {
pub fn resolve<'a>(&self, root: AnyNodeRef<'a>) -> Option<AnyNodeRef<'a>> {
// We need to do a binary search here. Only traverse into a node if the range is withint the node
let mut visitor = FindNodeKeyVisitor {
key: *self,
result: None,
};
if visitor.enter_node(root) == TraversalSignal::Traverse {
root.visit_preorder(&mut visitor);
}
visitor.result
}
}
/// Marker trait implemented by AST nodes for which we extract the `AstId`.
pub trait HasAstId: AstNode {
fn node_key(&self) -> TypedNodeKey<Self>
where
Self: Sized,
{
TypedNodeKey {
inner: self.syntax_node_key(),
_marker: PhantomData,
}
}
fn syntax_node_key(&self) -> NodeKey {
NodeKey {
kind: self.as_any_node_ref().kind(),
range: self.range(),
}
}
}
impl HasAstId for StmtFunctionDef {}
impl HasAstId for StmtClassDef {}
impl HasAstId for StmtAnnAssign {}
impl HasAstId for StmtAugAssign {}
impl HasAstId for StmtAssign {}
impl HasAstId for StmtTypeAlias {}
impl HasAstId for ModModule {}
impl HasAstId for StmtImport {}
impl HasAstId for StmtImportFrom {}
impl HasAstId for Parameter {}
impl HasAstId for TypeParam {}
impl HasAstId for Stmt {}
impl HasAstId for TypeParamTypeVar {}
impl HasAstId for TypeParamTypeVarTuple {}
impl HasAstId for TypeParamParamSpec {}
impl HasAstId for StmtGlobal {}
impl HasAstId for StmtNonlocal {}
impl HasAstId for ExceptHandlerExceptHandler {}
impl HasAstId for WithItem {}
impl HasAstId for MatchCase {}

View File

@@ -0,0 +1,158 @@
use std::fmt::Formatter;
use std::hash::Hash;
use std::sync::atomic::{AtomicUsize, Ordering};
use dashmap::mapref::entry::Entry;
use crate::FxDashMap;
/// Simple key value cache that locks on a per-key level.
pub struct KeyValueCache<K, V> {
map: FxDashMap<K, V>,
statistics: CacheStatistics,
}
impl<K, V> KeyValueCache<K, V>
where
K: Eq + Hash + Clone,
V: Clone,
{
pub fn try_get(&self, key: &K) -> Option<V> {
if let Some(existing) = self.map.get(key) {
self.statistics.hit();
Some(existing.clone())
} else {
self.statistics.miss();
None
}
}
pub fn get<F>(&self, key: &K, compute: F) -> V
where
F: FnOnce(&K) -> V,
{
match self.map.entry(key.clone()) {
Entry::Occupied(cached) => {
self.statistics.hit();
cached.get().clone()
}
Entry::Vacant(vacant) => {
self.statistics.miss();
let value = compute(key);
vacant.insert(value.clone());
value
}
}
}
pub fn set(&mut self, key: K, value: V) {
self.map.insert(key, value);
}
pub fn remove(&mut self, key: &K) -> Option<V> {
self.map.remove(key).map(|(_, value)| value)
}
pub fn clear(&mut self) {
self.map.clear();
self.map.shrink_to_fit();
}
pub fn statistics(&self) -> Option<Statistics> {
self.statistics.to_statistics()
}
}
impl<K, V> Default for KeyValueCache<K, V>
where
K: Eq + Hash,
V: Clone,
{
fn default() -> Self {
Self {
map: FxDashMap::default(),
statistics: CacheStatistics::default(),
}
}
}
impl<K, V> std::fmt::Debug for KeyValueCache<K, V>
where
K: std::fmt::Debug + Eq + Hash,
V: std::fmt::Debug,
{
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
let mut debug = f.debug_map();
for entry in &self.map {
debug.entry(&entry.value(), &entry.key());
}
debug.finish()
}
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Statistics {
pub hits: usize,
pub misses: usize,
}
impl Statistics {
#[allow(clippy::cast_precision_loss)]
pub fn hit_rate(&self) -> Option<f64> {
if self.hits + self.misses == 0 {
return None;
}
Some((self.hits as f64) / (self.hits + self.misses) as f64)
}
}
#[cfg(debug_assertions)]
pub type CacheStatistics = DebugStatistics;
#[cfg(not(debug_assertions))]
pub type CacheStatistics = ReleaseStatistics;
#[derive(Debug, Default)]
pub struct DebugStatistics {
hits: AtomicUsize,
misses: AtomicUsize,
}
impl DebugStatistics {
// TODO figure out appropriate Ordering
pub fn hit(&self) {
self.hits.fetch_add(1, Ordering::SeqCst);
}
pub fn miss(&self) {
self.misses.fetch_add(1, Ordering::SeqCst);
}
pub fn to_statistics(&self) -> Option<Statistics> {
let hits = self.hits.load(Ordering::SeqCst);
let misses = self.misses.load(Ordering::SeqCst);
Some(Statistics { hits, misses })
}
}
#[derive(Debug, Default)]
pub struct ReleaseStatistics;
impl ReleaseStatistics {
#[inline]
pub const fn hit(&self) {}
#[inline]
pub const fn miss(&self) {}
#[inline]
pub const fn to_statistics(&self) -> Option<Statistics> {
None
}
}

View File

@@ -0,0 +1,66 @@
use std::sync::{Arc, Condvar, Mutex};
#[derive(Debug, Default)]
pub struct CancellationTokenSource {
signal: Arc<(Mutex<bool>, Condvar)>,
}
impl CancellationTokenSource {
pub fn new() -> Self {
Self {
signal: Arc::new((Mutex::new(false), Condvar::default())),
}
}
#[tracing::instrument(level = "trace", skip_all)]
pub fn cancel(&self) {
let (cancelled, condvar) = &*self.signal;
let mut cancelled = cancelled.lock().unwrap();
if *cancelled {
return;
}
*cancelled = true;
condvar.notify_all();
}
pub fn is_cancelled(&self) -> bool {
let (cancelled, _) = &*self.signal;
*cancelled.lock().unwrap()
}
pub fn token(&self) -> CancellationToken {
CancellationToken {
signal: self.signal.clone(),
}
}
}
#[derive(Clone, Debug)]
pub struct CancellationToken {
signal: Arc<(Mutex<bool>, Condvar)>,
}
impl CancellationToken {
/// Returns `true` if cancellation has been requested.
pub fn is_cancelled(&self) -> bool {
let (cancelled, _) = &*self.signal;
*cancelled.lock().unwrap()
}
pub fn wait(&self) {
let (bool, condvar) = &*self.signal;
let lock = condvar
.wait_while(bool.lock().unwrap(), |bool| !*bool)
.unwrap();
debug_assert!(*lock);
drop(lock);
}
}

186
crates/red_knot/src/db.rs Normal file
View File

@@ -0,0 +1,186 @@
use std::path::Path;
use std::sync::Arc;
use crate::files::FileId;
use crate::lint::{Diagnostics, LintSemanticStorage, LintSyntaxStorage};
use crate::module::{Module, ModuleData, ModuleName, ModuleResolver, ModuleSearchPath};
use crate::parse::{Parsed, ParsedStorage};
use crate::source::{Source, SourceStorage};
use crate::symbols::{SymbolId, SymbolTable, SymbolTablesStorage};
use crate::types::{Type, TypeStore};
pub trait SourceDb {
// queries
fn file_id(&self, path: &std::path::Path) -> FileId;
fn file_path(&self, file_id: FileId) -> Arc<std::path::Path>;
fn source(&self, file_id: FileId) -> Source;
fn parse(&self, file_id: FileId) -> Parsed;
fn lint_syntax(&self, file_id: FileId) -> Diagnostics;
}
pub trait SemanticDb: SourceDb {
// queries
fn resolve_module(&self, name: ModuleName) -> Option<Module>;
fn file_to_module(&self, file_id: FileId) -> Option<Module>;
fn path_to_module(&self, path: &Path) -> Option<Module>;
fn symbol_table(&self, file_id: FileId) -> Arc<SymbolTable>;
fn infer_symbol_type(&self, file_id: FileId, symbol_id: SymbolId) -> Type;
fn lint_semantic(&self, file_id: FileId) -> Diagnostics;
// mutations
fn add_module(&mut self, path: &Path) -> Option<(Module, Vec<Arc<ModuleData>>)>;
fn set_module_search_paths(&mut self, paths: Vec<ModuleSearchPath>);
}
pub trait Db: SemanticDb {}
#[derive(Debug, Default)]
pub struct SourceJar {
pub sources: SourceStorage,
pub parsed: ParsedStorage,
pub lint_syntax: LintSyntaxStorage,
}
#[derive(Debug, Default)]
pub struct SemanticJar {
pub module_resolver: ModuleResolver,
pub symbol_tables: SymbolTablesStorage,
pub type_store: TypeStore,
pub lint_semantic: LintSemanticStorage,
}
/// Gives access to a specific jar in the database.
///
/// Nope, the terminology isn't borrowed from Java but from Salsa <https://salsa-rs.github.io/salsa/>,
/// which is an analogy to storing the salsa in different jars.
///
/// The basic idea is that each crate can define its own jar and the jars can be combined to a single
/// database in the top level crate. Each crate also defines its own `Database` trait. The combination of
/// `Database` trait and the jar allows to write queries in isolation without having to know how they get composed at the upper levels.
///
/// Salsa further defines a `HasIngredient` trait which slices the jar to a specific storage (e.g. a specific cache).
/// We don't need this just jet because we write our queries by hand. We may want a similar trait if we decide
/// to use a macro to generate the queries.
pub trait HasJar<T> {
/// Gives a read-only reference to the jar.
fn jar(&self) -> &T;
/// Gives a mutable reference to the jar.
fn jar_mut(&mut self) -> &mut T;
}
#[cfg(test)]
pub(crate) mod tests {
use std::path::Path;
use std::sync::Arc;
use crate::db::{HasJar, SourceDb, SourceJar};
use crate::files::{FileId, Files};
use crate::lint::{lint_semantic, lint_syntax, Diagnostics};
use crate::module::{
add_module, file_to_module, path_to_module, resolve_module, set_module_search_paths,
Module, ModuleData, ModuleName, ModuleSearchPath,
};
use crate::parse::{parse, Parsed};
use crate::source::{source_text, Source};
use crate::symbols::{symbol_table, SymbolId, SymbolTable};
use crate::types::{infer_symbol_type, Type};
use super::{SemanticDb, SemanticJar};
// This can be a partial database used in a single crate for testing.
// It would hold fewer data than the full database.
#[derive(Debug, Default)]
pub(crate) struct TestDb {
files: Files,
source: SourceJar,
semantic: SemanticJar,
}
impl HasJar<SourceJar> for TestDb {
fn jar(&self) -> &SourceJar {
&self.source
}
fn jar_mut(&mut self) -> &mut SourceJar {
&mut self.source
}
}
impl HasJar<SemanticJar> for TestDb {
fn jar(&self) -> &SemanticJar {
&self.semantic
}
fn jar_mut(&mut self) -> &mut SemanticJar {
&mut self.semantic
}
}
impl SourceDb for TestDb {
fn file_id(&self, path: &Path) -> FileId {
self.files.intern(path)
}
fn file_path(&self, file_id: FileId) -> Arc<Path> {
self.files.path(file_id)
}
fn source(&self, file_id: FileId) -> Source {
source_text(self, file_id)
}
fn parse(&self, file_id: FileId) -> Parsed {
parse(self, file_id)
}
fn lint_syntax(&self, file_id: FileId) -> Diagnostics {
lint_syntax(self, file_id)
}
}
impl SemanticDb for TestDb {
fn resolve_module(&self, name: ModuleName) -> Option<Module> {
resolve_module(self, name)
}
fn file_to_module(&self, file_id: FileId) -> Option<Module> {
file_to_module(self, file_id)
}
fn path_to_module(&self, path: &Path) -> Option<Module> {
path_to_module(self, path)
}
fn symbol_table(&self, file_id: FileId) -> Arc<SymbolTable> {
symbol_table(self, file_id)
}
fn infer_symbol_type(&self, file_id: FileId, symbol_id: SymbolId) -> Type {
infer_symbol_type(self, file_id, symbol_id)
}
fn lint_semantic(&self, file_id: FileId) -> Diagnostics {
lint_semantic(self, file_id)
}
fn add_module(&mut self, path: &Path) -> Option<(Module, Vec<Arc<ModuleData>>)> {
add_module(self, path)
}
fn set_module_search_paths(&mut self, paths: Vec<ModuleSearchPath>) {
set_module_search_paths(self, paths);
}
}
}

View File

@@ -0,0 +1,148 @@
use std::fmt::{Debug, Formatter};
use std::hash::{Hash, Hasher};
use std::path::Path;
use std::sync::Arc;
use hashbrown::hash_map::RawEntryMut;
use parking_lot::RwLock;
use rustc_hash::FxHasher;
use ruff_index::{newtype_index, IndexVec};
type Map<K, V> = hashbrown::HashMap<K, V, ()>;
#[newtype_index]
pub struct FileId;
// TODO we'll need a higher level virtual file system abstraction that allows testing if a file exists
// or retrieving its content (ideally lazily and in a way that the memory can be retained later)
// I suspect that we'll end up with a FileSystem trait and our own Path abstraction.
#[derive(Clone, Default)]
pub struct Files {
inner: Arc<RwLock<FilesInner>>,
}
impl Files {
#[tracing::instrument(level = "debug", skip(self))]
pub fn intern(&self, path: &Path) -> FileId {
self.inner.write().intern(path)
}
pub fn try_get(&self, path: &Path) -> Option<FileId> {
self.inner.read().try_get(path)
}
#[tracing::instrument(level = "debug", skip(self))]
pub fn path(&self, id: FileId) -> Arc<Path> {
self.inner.read().path(id)
}
}
impl Debug for Files {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
let files = self.inner.read();
let mut debug = f.debug_map();
for item in files.iter() {
debug.entry(&item.0, &item.1);
}
debug.finish()
}
}
impl PartialEq for Files {
fn eq(&self, other: &Self) -> bool {
self.inner.read().eq(&other.inner.read())
}
}
impl Eq for Files {}
#[derive(Default)]
struct FilesInner {
by_path: Map<FileId, ()>,
// TODO should we use a map here to reclaim the space for removed files?
// TODO I think we should use our own path abstraction here to avoid having to normalize paths
// and dealing with non-utf paths everywhere.
by_id: IndexVec<FileId, Arc<Path>>,
}
impl FilesInner {
/// Inserts the path and returns a new id for it or returns the id if it is an existing path.
// TODO should this accept Path or PathBuf?
pub(crate) fn intern(&mut self, path: &Path) -> FileId {
let mut hasher = FxHasher::default();
path.hash(&mut hasher);
let hash = hasher.finish();
let entry = self
.by_path
.raw_entry_mut()
.from_hash(hash, |existing_file| &*self.by_id[*existing_file] == path);
match entry {
RawEntryMut::Occupied(entry) => *entry.key(),
RawEntryMut::Vacant(entry) => {
let id = self.by_id.push(Arc::from(path));
entry.insert_with_hasher(hash, id, (), |_| hash);
id
}
}
}
pub(crate) fn try_get(&self, path: &Path) -> Option<FileId> {
let mut hasher = FxHasher::default();
path.hash(&mut hasher);
let hash = hasher.finish();
Some(
*self
.by_path
.raw_entry()
.from_hash(hash, |existing_file| &*self.by_id[*existing_file] == path)?
.0,
)
}
/// Returns the path for the file with the given id.
pub(crate) fn path(&self, id: FileId) -> Arc<Path> {
self.by_id[id].clone()
}
pub(crate) fn iter(&self) -> impl Iterator<Item = (FileId, Arc<Path>)> + '_ {
self.by_path.keys().map(|id| (*id, self.by_id[*id].clone()))
}
}
impl PartialEq for FilesInner {
fn eq(&self, other: &Self) -> bool {
self.by_id == other.by_id
}
}
impl Eq for FilesInner {}
#[cfg(test)]
mod tests {
use super::*;
use std::path::PathBuf;
#[test]
fn insert_path_twice_same_id() {
let files = Files::default();
let path = PathBuf::from("foo/bar");
let id1 = files.intern(&path);
let id2 = files.intern(&path);
assert_eq!(id1, id2);
}
#[test]
fn insert_different_paths_different_ids() {
let files = Files::default();
let path1 = PathBuf::from("foo/bar");
let path2 = PathBuf::from("foo/bar/baz");
let id1 = files.intern(&path1);
let id2 = files.intern(&path2);
assert_ne!(id1, id2);
}
}

View File

@@ -0,0 +1,67 @@
//! Key observations
//!
//! The HIR avoids allocations to large extends by:
//! * Using an arena per node type
//! * using ids and id ranges to reference items.
//!
//! Using separate arena per node type has the advantage that the IDs are relatively stable, because
//! they only change when a node of the same kind has been added or removed. (What's unclear is if that matters or if
//! it still triggers a re-compute because the AST-id in the node has changed).
//!
//! The HIR does not store all details. It mainly stores the *public* interface. There's a reference
//! back to the AST node to get more details.
//!
//!
use crate::ast_ids::{HasAstId, TypedAstId};
use crate::files::FileId;
use std::fmt::Formatter;
use std::hash::{Hash, Hasher};
pub struct HirAstId<N: HasAstId> {
file_id: FileId,
node_id: TypedAstId<N>,
}
impl<N: HasAstId> Copy for HirAstId<N> {}
impl<N: HasAstId> Clone for HirAstId<N> {
fn clone(&self) -> Self {
*self
}
}
impl<N: HasAstId> PartialEq for HirAstId<N> {
fn eq(&self, other: &Self) -> bool {
self.file_id == other.file_id && self.node_id == other.node_id
}
}
impl<N: HasAstId> Eq for HirAstId<N> {}
impl<N: HasAstId> std::fmt::Debug for HirAstId<N> {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_struct("HirAstId")
.field("file_id", &self.file_id)
.field("node_id", &self.node_id)
.finish()
}
}
impl<N: HasAstId> Hash for HirAstId<N> {
fn hash<H: Hasher>(&self, state: &mut H) {
self.file_id.hash(state);
self.node_id.hash(state);
}
}
impl<N: HasAstId> HirAstId<N> {
pub fn upcast<M: HasAstId>(self) -> HirAstId<M>
where
N: Into<M>,
{
HirAstId {
file_id: self.file_id,
node_id: self.node_id.upcast(),
}
}
}

View File

@@ -0,0 +1,556 @@
use std::ops::{Index, Range};
use ruff_index::{newtype_index, IndexVec};
use ruff_python_ast::visitor::preorder;
use ruff_python_ast::visitor::preorder::PreorderVisitor;
use ruff_python_ast::{
Decorator, ExceptHandler, ExceptHandlerExceptHandler, Expr, MatchCase, ModModule, Stmt,
StmtAnnAssign, StmtAssign, StmtClassDef, StmtFunctionDef, StmtGlobal, StmtImport,
StmtImportFrom, StmtNonlocal, StmtTypeAlias, TypeParam, TypeParamParamSpec, TypeParamTypeVar,
TypeParamTypeVarTuple, WithItem,
};
use crate::ast_ids::{AstIds, HasAstId};
use crate::files::FileId;
use crate::hir::HirAstId;
use crate::Name;
#[newtype_index]
pub struct FunctionId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Function {
ast_id: HirAstId<StmtFunctionDef>,
name: Name,
parameters: Range<ParameterId>,
type_parameters: Range<TypeParameterId>, // TODO: type_parameters, return expression, decorators
}
#[newtype_index]
pub struct ParameterId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Parameter {
kind: ParameterKind,
name: Name,
default: Option<()>, // TODO use expression HIR
ast_id: HirAstId<ruff_python_ast::Parameter>,
}
// TODO or should `Parameter` be an enum?
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub enum ParameterKind {
PositionalOnly,
Arguments,
Vararg,
KeywordOnly,
Kwarg,
}
#[newtype_index]
pub struct ClassId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Class {
name: Name,
ast_id: HirAstId<StmtClassDef>,
// TODO type parameters, inheritance, decorators, members
}
#[newtype_index]
pub struct AssignmentId;
// This can have more than one name...
// but that means we can't implement `name()` on `ModuleItem`.
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Assignment {
// TODO: Handle multiple names / targets
name: Name,
ast_id: HirAstId<StmtAssign>,
}
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct AnnotatedAssignment {
name: Name,
ast_id: HirAstId<StmtAnnAssign>,
}
#[newtype_index]
pub struct AnnotatedAssignmentId;
#[newtype_index]
pub struct TypeAliasId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct TypeAlias {
name: Name,
ast_id: HirAstId<StmtTypeAlias>,
parameters: Range<TypeParameterId>,
}
#[newtype_index]
pub struct TypeParameterId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub enum TypeParameter {
TypeVar(TypeParameterTypeVar),
ParamSpec(TypeParameterParamSpec),
TypeVarTuple(TypeParameterTypeVarTuple),
}
impl TypeParameter {
pub fn ast_id(&self) -> HirAstId<TypeParam> {
match self {
TypeParameter::TypeVar(type_var) => type_var.ast_id.upcast(),
TypeParameter::ParamSpec(param_spec) => param_spec.ast_id.upcast(),
TypeParameter::TypeVarTuple(type_var_tuple) => type_var_tuple.ast_id.upcast(),
}
}
}
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct TypeParameterTypeVar {
name: Name,
ast_id: HirAstId<TypeParamTypeVar>,
}
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct TypeParameterParamSpec {
name: Name,
ast_id: HirAstId<TypeParamParamSpec>,
}
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct TypeParameterTypeVarTuple {
name: Name,
ast_id: HirAstId<TypeParamTypeVarTuple>,
}
#[newtype_index]
pub struct GlobalId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Global {
// TODO track names
ast_id: HirAstId<StmtGlobal>,
}
#[newtype_index]
pub struct NonLocalId;
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct NonLocal {
// TODO track names
ast_id: HirAstId<StmtNonlocal>,
}
pub enum DefinitionId {
Function(FunctionId),
Parameter(ParameterId),
Class(ClassId),
Assignment(AssignmentId),
AnnotatedAssignment(AnnotatedAssignmentId),
Global(GlobalId),
NonLocal(NonLocalId),
TypeParameter(TypeParameterId),
TypeAlias(TypeAlias),
}
pub enum DefinitionItem {
Function(Function),
Parameter(Parameter),
Class(Class),
Assignment(Assignment),
AnnotatedAssignment(AnnotatedAssignment),
Global(Global),
NonLocal(NonLocal),
TypeParameter(TypeParameter),
TypeAlias(TypeAlias),
}
// The closest is rust-analyzers item-tree. It only represents "Items" which make the public interface of a module
// (it excludes any other statement or expressions). rust-analyzer uses it as the main input to the name resolution
// algorithm
// > It is the input to the name resolution algorithm, as well as to the queries defined in `adt.rs`,
// > `data.rs`, and most things in `attr.rs`.
//
// > One important purpose of this layer is to provide an "invalidation barrier" for incremental
// > computations: when typing inside an item body, the `ItemTree` of the modified file is typically
// > unaffected, so we don't have to recompute name resolution results or item data (see `data.rs`).
//
// I haven't fully figured this out but I think that this composes the "public" interface of a module?
// But maybe that's too optimistic.
//
//
#[derive(Debug, Clone, Default, Eq, PartialEq)]
pub struct Definitions {
functions: IndexVec<FunctionId, Function>,
parameters: IndexVec<ParameterId, Parameter>,
classes: IndexVec<ClassId, Class>,
assignments: IndexVec<AssignmentId, Assignment>,
annotated_assignments: IndexVec<AnnotatedAssignmentId, AnnotatedAssignment>,
type_aliases: IndexVec<TypeAliasId, TypeAlias>,
type_parameters: IndexVec<TypeParameterId, TypeParameter>,
globals: IndexVec<GlobalId, Global>,
non_locals: IndexVec<NonLocalId, NonLocal>,
}
impl Definitions {
pub fn from_module(module: &ModModule, ast_ids: &AstIds, file_id: FileId) -> Self {
let mut visitor = DefinitionsVisitor {
definitions: Definitions::default(),
ast_ids,
file_id,
};
visitor.visit_body(&module.body);
visitor.definitions
}
}
impl Index<FunctionId> for Definitions {
type Output = Function;
fn index(&self, index: FunctionId) -> &Self::Output {
&self.functions[index]
}
}
impl Index<ParameterId> for Definitions {
type Output = Parameter;
fn index(&self, index: ParameterId) -> &Self::Output {
&self.parameters[index]
}
}
impl Index<ClassId> for Definitions {
type Output = Class;
fn index(&self, index: ClassId) -> &Self::Output {
&self.classes[index]
}
}
impl Index<AssignmentId> for Definitions {
type Output = Assignment;
fn index(&self, index: AssignmentId) -> &Self::Output {
&self.assignments[index]
}
}
impl Index<AnnotatedAssignmentId> for Definitions {
type Output = AnnotatedAssignment;
fn index(&self, index: AnnotatedAssignmentId) -> &Self::Output {
&self.annotated_assignments[index]
}
}
impl Index<TypeAliasId> for Definitions {
type Output = TypeAlias;
fn index(&self, index: TypeAliasId) -> &Self::Output {
&self.type_aliases[index]
}
}
impl Index<GlobalId> for Definitions {
type Output = Global;
fn index(&self, index: GlobalId) -> &Self::Output {
&self.globals[index]
}
}
impl Index<NonLocalId> for Definitions {
type Output = NonLocal;
fn index(&self, index: NonLocalId) -> &Self::Output {
&self.non_locals[index]
}
}
impl Index<TypeParameterId> for Definitions {
type Output = TypeParameter;
fn index(&self, index: TypeParameterId) -> &Self::Output {
&self.type_parameters[index]
}
}
struct DefinitionsVisitor<'a> {
definitions: Definitions,
ast_ids: &'a AstIds,
file_id: FileId,
}
impl DefinitionsVisitor<'_> {
fn ast_id<N: HasAstId>(&self, node: &N) -> HirAstId<N> {
HirAstId {
file_id: self.file_id,
node_id: self.ast_ids.ast_id(node),
}
}
fn lower_function_def(&mut self, function: &StmtFunctionDef) -> FunctionId {
let name = Name::new(&function.name);
let first_type_parameter_id = self.definitions.type_parameters.next_index();
let mut last_type_parameter_id = first_type_parameter_id;
if let Some(type_params) = &function.type_params {
for parameter in &type_params.type_params {
let id = self.lower_type_parameter(parameter);
last_type_parameter_id = id;
}
}
let parameters = self.lower_parameters(&function.parameters);
self.definitions.functions.push(Function {
name,
ast_id: self.ast_id(function),
parameters,
type_parameters: first_type_parameter_id..last_type_parameter_id,
})
}
fn lower_parameters(&mut self, parameters: &ruff_python_ast::Parameters) -> Range<ParameterId> {
let first_parameter_id = self.definitions.parameters.next_index();
let mut last_parameter_id = first_parameter_id;
for parameter in &parameters.posonlyargs {
last_parameter_id = self.definitions.parameters.push(Parameter {
kind: ParameterKind::PositionalOnly,
name: Name::new(&parameter.parameter.name),
default: None,
ast_id: self.ast_id(&parameter.parameter),
});
}
if let Some(vararg) = &parameters.vararg {
last_parameter_id = self.definitions.parameters.push(Parameter {
kind: ParameterKind::Vararg,
name: Name::new(&vararg.name),
default: None,
ast_id: self.ast_id(vararg),
});
}
for parameter in &parameters.kwonlyargs {
last_parameter_id = self.definitions.parameters.push(Parameter {
kind: ParameterKind::KeywordOnly,
name: Name::new(&parameter.parameter.name),
default: None,
ast_id: self.ast_id(&parameter.parameter),
});
}
if let Some(kwarg) = &parameters.kwarg {
last_parameter_id = self.definitions.parameters.push(Parameter {
kind: ParameterKind::KeywordOnly,
name: Name::new(&kwarg.name),
default: None,
ast_id: self.ast_id(kwarg),
});
}
first_parameter_id..last_parameter_id
}
fn lower_class_def(&mut self, class: &StmtClassDef) -> ClassId {
let name = Name::new(&class.name);
self.definitions.classes.push(Class {
name,
ast_id: self.ast_id(class),
})
}
fn lower_assignment(&mut self, assignment: &StmtAssign) {
// FIXME handle multiple names
if let Some(Expr::Name(name)) = assignment.targets.first() {
self.definitions.assignments.push(Assignment {
name: Name::new(&name.id),
ast_id: self.ast_id(assignment),
});
}
}
fn lower_annotated_assignment(&mut self, annotated_assignment: &StmtAnnAssign) {
if let Expr::Name(name) = &*annotated_assignment.target {
self.definitions
.annotated_assignments
.push(AnnotatedAssignment {
name: Name::new(&name.id),
ast_id: self.ast_id(annotated_assignment),
});
}
}
fn lower_type_alias(&mut self, type_alias: &StmtTypeAlias) {
if let Expr::Name(name) = &*type_alias.name {
let name = Name::new(&name.id);
let lower_parameters_id = self.definitions.type_parameters.next_index();
let mut last_parameter_id = lower_parameters_id;
if let Some(type_params) = &type_alias.type_params {
for type_parameter in &type_params.type_params {
let id = self.lower_type_parameter(type_parameter);
last_parameter_id = id;
}
}
self.definitions.type_aliases.push(TypeAlias {
name,
ast_id: self.ast_id(type_alias),
parameters: lower_parameters_id..last_parameter_id,
});
}
}
fn lower_type_parameter(&mut self, type_parameter: &TypeParam) -> TypeParameterId {
match type_parameter {
TypeParam::TypeVar(type_var) => {
self.definitions
.type_parameters
.push(TypeParameter::TypeVar(TypeParameterTypeVar {
name: Name::new(&type_var.name),
ast_id: self.ast_id(type_var),
}))
}
TypeParam::ParamSpec(param_spec) => {
self.definitions
.type_parameters
.push(TypeParameter::ParamSpec(TypeParameterParamSpec {
name: Name::new(&param_spec.name),
ast_id: self.ast_id(param_spec),
}))
}
TypeParam::TypeVarTuple(type_var_tuple) => {
self.definitions
.type_parameters
.push(TypeParameter::TypeVarTuple(TypeParameterTypeVarTuple {
name: Name::new(&type_var_tuple.name),
ast_id: self.ast_id(type_var_tuple),
}))
}
}
}
fn lower_import(&mut self, _import: &StmtImport) {
// TODO
}
fn lower_import_from(&mut self, _import_from: &StmtImportFrom) {
// TODO
}
fn lower_global(&mut self, global: &StmtGlobal) -> GlobalId {
self.definitions.globals.push(Global {
ast_id: self.ast_id(global),
})
}
fn lower_non_local(&mut self, non_local: &StmtNonlocal) -> NonLocalId {
self.definitions.non_locals.push(NonLocal {
ast_id: self.ast_id(non_local),
})
}
fn lower_except_handler(&mut self, _except_handler: &ExceptHandlerExceptHandler) {
// TODO
}
fn lower_with_item(&mut self, _with_item: &WithItem) {
// TODO
}
fn lower_match_case(&mut self, _match_case: &MatchCase) {
// TODO
}
}
impl PreorderVisitor<'_> for DefinitionsVisitor<'_> {
fn visit_stmt(&mut self, stmt: &Stmt) {
match stmt {
// Definition statements
Stmt::FunctionDef(definition) => {
self.lower_function_def(definition);
self.visit_body(&definition.body);
}
Stmt::ClassDef(definition) => {
self.lower_class_def(definition);
self.visit_body(&definition.body);
}
Stmt::Assign(assignment) => {
self.lower_assignment(assignment);
}
Stmt::AnnAssign(annotated_assignment) => {
self.lower_annotated_assignment(annotated_assignment);
}
Stmt::TypeAlias(type_alias) => {
self.lower_type_alias(type_alias);
}
Stmt::Import(import) => self.lower_import(import),
Stmt::ImportFrom(import_from) => self.lower_import_from(import_from),
Stmt::Global(global) => {
self.lower_global(global);
}
Stmt::Nonlocal(non_local) => {
self.lower_non_local(non_local);
}
// Visit the compound statement bodies because they can contain other definitions.
Stmt::For(_)
| Stmt::While(_)
| Stmt::If(_)
| Stmt::With(_)
| Stmt::Match(_)
| Stmt::Try(_) => {
preorder::walk_stmt(self, stmt);
}
// Skip over simple statements because they can't contain any other definitions.
Stmt::Return(_)
| Stmt::Delete(_)
| Stmt::AugAssign(_)
| Stmt::Raise(_)
| Stmt::Assert(_)
| Stmt::Expr(_)
| Stmt::Pass(_)
| Stmt::Break(_)
| Stmt::Continue(_)
| Stmt::IpyEscapeCommand(_) => {
// No op
}
}
}
fn visit_expr(&mut self, _: &'_ Expr) {}
fn visit_decorator(&mut self, _decorator: &'_ Decorator) {}
fn visit_except_handler(&mut self, except_handler: &'_ ExceptHandler) {
match except_handler {
ExceptHandler::ExceptHandler(except_handler) => {
self.lower_except_handler(except_handler);
}
}
}
fn visit_with_item(&mut self, with_item: &'_ WithItem) {
self.lower_with_item(with_item);
}
fn visit_match_case(&mut self, match_case: &'_ MatchCase) {
self.lower_match_case(match_case);
self.visit_body(&match_case.body);
}
}

109
crates/red_knot/src/lib.rs Normal file
View File

@@ -0,0 +1,109 @@
use std::fmt::Formatter;
use std::hash::BuildHasherDefault;
use std::ops::Deref;
use std::path::{Path, PathBuf};
use rustc_hash::{FxHashSet, FxHasher};
use crate::files::FileId;
pub mod ast_ids;
pub mod cache;
pub mod cancellation;
pub mod db;
pub mod files;
pub mod hir;
pub mod lint;
pub mod module;
mod parse;
pub mod program;
pub mod source;
mod symbols;
mod types;
pub mod watch;
pub(crate) type FxDashMap<K, V> = dashmap::DashMap<K, V, BuildHasherDefault<FxHasher>>;
#[allow(unused)]
pub(crate) type FxDashSet<V> = dashmap::DashSet<V, BuildHasherDefault<FxHasher>>;
pub(crate) type FxIndexSet<V> = indexmap::set::IndexSet<V, BuildHasherDefault<FxHasher>>;
#[derive(Debug)]
pub struct Workspace {
/// TODO this should be a resolved path. We should probably use a newtype wrapper that guarantees that
/// PATH is a UTF-8 path and is normalized.
root: PathBuf,
/// The files that are open in the workspace.
///
/// * Editor: The files that are actively being edited in the editor (the user has a tab open with the file).
/// * CLI: The resolved files passed as arguments to the CLI.
open_files: FxHashSet<FileId>,
}
impl Workspace {
pub fn new(root: PathBuf) -> Self {
Self {
root,
open_files: FxHashSet::default(),
}
}
pub fn root(&self) -> &Path {
self.root.as_path()
}
// TODO having the content in workspace feels wrong.
pub fn open_file(&mut self, file_id: FileId) {
self.open_files.insert(file_id);
}
pub fn close_file(&mut self, file_id: FileId) {
self.open_files.remove(&file_id);
}
// TODO introduce an `OpenFile` type instead of using an anonymous tuple.
pub fn open_files(&self) -> impl Iterator<Item = FileId> + '_ {
self.open_files.iter().copied()
}
pub fn is_file_open(&self, file_id: FileId) -> bool {
self.open_files.contains(&file_id)
}
}
#[derive(Debug, Clone, Eq, PartialEq, Hash)]
pub struct Name(smol_str::SmolStr);
impl Name {
#[inline]
pub fn new(name: &str) -> Self {
Self(smol_str::SmolStr::new(name))
}
pub fn as_str(&self) -> &str {
self.0.as_str()
}
}
impl Deref for Name {
type Target = str;
#[inline]
fn deref(&self) -> &Self::Target {
self.as_str()
}
}
impl<T> From<T> for Name
where
T: Into<smol_str::SmolStr>,
{
fn from(value: T) -> Self {
Self(value.into())
}
}
impl std::fmt::Display for Name {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.write_str(self.as_str())
}
}

252
crates/red_knot/src/lint.rs Normal file
View File

@@ -0,0 +1,252 @@
use std::cell::RefCell;
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{ModModule, StringLiteral};
use crate::cache::KeyValueCache;
use crate::db::{HasJar, SemanticDb, SemanticJar, SourceDb, SourceJar};
use crate::files::FileId;
use crate::parse::Parsed;
use crate::source::Source;
use crate::symbols::{Definition, SymbolId, SymbolTable};
use crate::types::Type;
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn lint_syntax<Db>(db: &Db, file_id: FileId) -> Diagnostics
where
Db: SourceDb + HasJar<SourceJar>,
{
let storage = &db.jar().lint_syntax;
storage.get(&file_id, |file_id| {
let mut diagnostics = Vec::new();
let source = db.source(*file_id);
lint_lines(source.text(), &mut diagnostics);
let parsed = db.parse(*file_id);
if parsed.errors().is_empty() {
let ast = parsed.ast();
let mut visitor = SyntaxLintVisitor {
diagnostics,
source: source.text(),
};
visitor.visit_body(&ast.body);
diagnostics = visitor.diagnostics;
} else {
diagnostics.extend(parsed.errors().iter().map(std::string::ToString::to_string));
}
Diagnostics::from(diagnostics)
})
}
fn lint_lines(source: &str, diagnostics: &mut Vec<String>) {
for (line_number, line) in source.lines().enumerate() {
if line.len() < 88 {
continue;
}
let char_count = line.chars().count();
if char_count > 88 {
diagnostics.push(format!(
"Line {} is too long ({} characters)",
line_number + 1,
char_count
));
}
}
}
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn lint_semantic<Db>(db: &Db, file_id: FileId) -> Diagnostics
where
Db: SemanticDb + HasJar<SemanticJar>,
{
let storage = &db.jar().lint_semantic;
storage.get(&file_id, |file_id| {
let source = db.source(*file_id);
let parsed = db.parse(*file_id);
let symbols = db.symbol_table(*file_id);
let context = SemanticLintContext {
file_id: *file_id,
source,
parsed,
symbols,
db,
diagnostics: RefCell::new(Vec::new()),
};
lint_unresolved_imports(&context);
Diagnostics::from(context.diagnostics.take())
})
}
fn lint_unresolved_imports(context: &SemanticLintContext) {
// TODO: Consider iterating over the dependencies (imports) only instead of all definitions.
for (symbol, definition) in context.symbols().all_definitions() {
match definition {
Definition::Import(import) => {
let ty = context.eval_symbol(symbol);
if ty.is_unknown() {
context.push_diagnostic(format!("Unresolved module {}", import.module));
}
}
Definition::ImportFrom(import) => {
let ty = context.eval_symbol(symbol);
if ty.is_unknown() {
let module_name = import.module().map(Deref::deref).unwrap_or_default();
let message = if import.level() > 0 {
format!(
"Unresolved relative import '{}' from {}{}",
import.name(),
".".repeat(import.level() as usize),
module_name
)
} else {
format!(
"Unresolved import '{}' from '{}'",
import.name(),
module_name
)
};
context.push_diagnostic(message);
}
}
_ => {}
}
}
}
pub struct SemanticLintContext<'a> {
file_id: FileId,
source: Source,
parsed: Parsed,
symbols: Arc<SymbolTable>,
db: &'a dyn SemanticDb,
diagnostics: RefCell<Vec<String>>,
}
impl<'a> SemanticLintContext<'a> {
pub fn source_text(&self) -> &str {
self.source.text()
}
pub fn file_id(&self) -> FileId {
self.file_id
}
pub fn ast(&self) -> &ModModule {
self.parsed.ast()
}
pub fn symbols(&self) -> &SymbolTable {
&self.symbols
}
pub fn eval_symbol(&self, symbol_id: SymbolId) -> Type {
self.db.infer_symbol_type(self.file_id, symbol_id)
}
pub fn push_diagnostic(&self, diagnostic: String) {
self.diagnostics.borrow_mut().push(diagnostic);
}
pub fn extend_diagnostics(&mut self, diagnostics: impl IntoIterator<Item = String>) {
self.diagnostics.get_mut().extend(diagnostics);
}
}
#[derive(Debug)]
struct SyntaxLintVisitor<'a> {
diagnostics: Vec<String>,
source: &'a str,
}
impl Visitor<'_> for SyntaxLintVisitor<'_> {
fn visit_string_literal(&mut self, string_literal: &'_ StringLiteral) {
// A very naive implementation of use double quotes
let text = &self.source[string_literal.range];
if text.starts_with('\'') {
self.diagnostics
.push("Use double quotes for strings".to_string());
}
}
}
#[derive(Debug, Clone)]
pub enum Diagnostics {
Empty,
List(Arc<Vec<String>>),
}
impl Diagnostics {
pub fn as_slice(&self) -> &[String] {
match self {
Diagnostics::Empty => &[],
Diagnostics::List(list) => list.as_slice(),
}
}
}
impl Deref for Diagnostics {
type Target = [String];
fn deref(&self) -> &Self::Target {
self.as_slice()
}
}
impl From<Vec<String>> for Diagnostics {
fn from(value: Vec<String>) -> Self {
if value.is_empty() {
Diagnostics::Empty
} else {
Diagnostics::List(Arc::new(value))
}
}
}
#[derive(Default, Debug)]
pub struct LintSyntaxStorage(KeyValueCache<FileId, Diagnostics>);
impl Deref for LintSyntaxStorage {
type Target = KeyValueCache<FileId, Diagnostics>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for LintSyntaxStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
#[derive(Default, Debug)]
pub struct LintSemanticStorage(KeyValueCache<FileId, Diagnostics>);
impl Deref for LintSemanticStorage {
type Target = KeyValueCache<FileId, Diagnostics>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for LintSemanticStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}

469
crates/red_knot/src/main.rs Normal file
View File

@@ -0,0 +1,469 @@
#![allow(clippy::dbg_macro)]
use std::collections::hash_map::Entry;
use std::path::Path;
use std::sync::Mutex;
use rustc_hash::FxHashMap;
use tracing::subscriber::Interest;
use tracing::{Level, Metadata};
use tracing_subscriber::filter::LevelFilter;
use tracing_subscriber::layer::{Context, Filter, SubscriberExt};
use tracing_subscriber::{Layer, Registry};
use tracing_tree::time::Uptime;
use red_knot::cancellation::CancellationTokenSource;
use red_knot::db::{HasJar, SourceDb, SourceJar};
use red_knot::files::FileId;
use red_knot::module::{ModuleSearchPath, ModuleSearchPathKind};
use red_knot::program::check::{CheckError, RayonCheckScheduler};
use red_knot::program::{FileChange, FileChangeKind, Program};
use red_knot::watch::FileWatcher;
use red_knot::Workspace;
#[allow(clippy::print_stdout, clippy::unnecessary_wraps, clippy::print_stderr)]
fn main() -> anyhow::Result<()> {
setup_tracing();
let arguments: Vec<_> = std::env::args().collect();
if arguments.len() < 2 {
eprintln!("Usage: red_knot <path>");
return Err(anyhow::anyhow!("Invalid arguments"));
}
let entry_point = Path::new(&arguments[1]);
if !entry_point.exists() {
eprintln!("The entry point does not exist.");
return Err(anyhow::anyhow!("Invalid arguments"));
}
if !entry_point.is_file() {
eprintln!("The entry point is not a file.");
return Err(anyhow::anyhow!("Invalid arguments"));
}
let workspace_folder = entry_point.parent().unwrap();
let workspace = Workspace::new(workspace_folder.to_path_buf());
let workspace_search_path = ModuleSearchPath::new(
workspace.root().to_path_buf(),
ModuleSearchPathKind::FirstParty,
);
let mut program = Program::new(workspace, vec![workspace_search_path]);
let entry_id = program.file_id(entry_point);
program.workspace_mut().open_file(entry_id);
let (main_loop, main_loop_cancellation_token) = MainLoop::new();
// Listen to Ctrl+C and abort the watch mode.
let main_loop_cancellation_token = Mutex::new(Some(main_loop_cancellation_token));
ctrlc::set_handler(move || {
let mut lock = main_loop_cancellation_token.lock().unwrap();
if let Some(token) = lock.take() {
token.stop();
}
})?;
let file_changes_notifier = main_loop.file_changes_notifier();
// Watch for file changes and re-trigger the analysis.
let mut file_watcher = FileWatcher::new(
move |changes| {
file_changes_notifier.notify(changes);
},
program.files().clone(),
)?;
file_watcher.watch_folder(workspace_folder)?;
main_loop.run(&mut program);
let source_jar: &SourceJar = program.jar();
dbg!(source_jar.parsed.statistics());
dbg!(source_jar.sources.statistics());
Ok(())
}
struct MainLoop {
orchestrator_sender: crossbeam_channel::Sender<OrchestratorMessage>,
main_loop_receiver: crossbeam_channel::Receiver<MainLoopMessage>,
}
impl MainLoop {
fn new() -> (Self, MainLoopCancellationToken) {
let (orchestrator_sender, orchestrator_receiver) = crossbeam_channel::bounded(1);
let (main_loop_sender, main_loop_receiver) = crossbeam_channel::bounded(1);
let mut orchestrator = Orchestrator {
pending_analysis: None,
receiver: orchestrator_receiver,
sender: main_loop_sender.clone(),
aggregated_changes: AggregatedChanges::default(),
};
std::thread::spawn(move || {
orchestrator.run();
});
(
Self {
orchestrator_sender,
main_loop_receiver,
},
MainLoopCancellationToken {
sender: main_loop_sender,
},
)
}
fn file_changes_notifier(&self) -> FileChangesNotifier {
FileChangesNotifier {
sender: self.orchestrator_sender.clone(),
}
}
fn run(self, program: &mut Program) {
self.orchestrator_sender
.send(OrchestratorMessage::Run)
.unwrap();
for message in &self.main_loop_receiver {
tracing::trace!("Main Loop: Tick");
match message {
MainLoopMessage::CheckProgram => {
// Remove mutability from program.
let program = &*program;
let run_cancellation_token_source = CancellationTokenSource::new();
let run_cancellation_token = run_cancellation_token_source.token();
let sender = &self.orchestrator_sender;
sender
.send(OrchestratorMessage::CheckProgramStarted {
cancellation_token: run_cancellation_token_source,
})
.unwrap();
rayon::in_place_scope(|scope| {
let scheduler = RayonCheckScheduler::new(program, scope);
let result = program.check(&scheduler, run_cancellation_token);
match result {
Ok(result) => sender
.send(OrchestratorMessage::CheckProgramCompleted(result))
.unwrap(),
Err(CheckError::Cancelled) => sender
.send(OrchestratorMessage::CheckProgramCancelled)
.unwrap(),
}
});
}
MainLoopMessage::ApplyChanges(changes) => {
program.apply_changes(changes.iter());
}
MainLoopMessage::CheckCompleted(diagnostics) => {
dbg!(diagnostics);
}
MainLoopMessage::Exit => {
return;
}
}
}
}
}
impl Drop for MainLoop {
fn drop(&mut self) {
self.orchestrator_sender
.send(OrchestratorMessage::Shutdown)
.unwrap();
}
}
#[derive(Debug, Clone)]
struct FileChangesNotifier {
sender: crossbeam_channel::Sender<OrchestratorMessage>,
}
impl FileChangesNotifier {
fn notify(&self, changes: Vec<FileChange>) {
self.sender
.send(OrchestratorMessage::FileChanges(changes))
.unwrap();
}
}
#[derive(Debug)]
struct MainLoopCancellationToken {
sender: crossbeam_channel::Sender<MainLoopMessage>,
}
impl MainLoopCancellationToken {
fn stop(self) {
self.sender.send(MainLoopMessage::Exit).unwrap();
}
}
struct Orchestrator {
aggregated_changes: AggregatedChanges,
pending_analysis: Option<PendingAnalysisState>,
/// Sends messages to the main loop.
sender: crossbeam_channel::Sender<MainLoopMessage>,
/// Receives messages from the main loop.
receiver: crossbeam_channel::Receiver<OrchestratorMessage>,
}
impl Orchestrator {
fn run(&mut self) {
while let Ok(message) = self.receiver.recv() {
match message {
OrchestratorMessage::Run => {
self.pending_analysis = None;
self.sender.send(MainLoopMessage::CheckProgram).unwrap();
}
OrchestratorMessage::CheckProgramStarted { cancellation_token } => {
debug_assert!(self.pending_analysis.is_none());
self.pending_analysis = Some(PendingAnalysisState { cancellation_token });
}
OrchestratorMessage::CheckProgramCompleted(diagnostics) => {
self.pending_analysis
.take()
.expect("Expected a pending analysis.");
self.sender
.send(MainLoopMessage::CheckCompleted(diagnostics))
.unwrap();
}
OrchestratorMessage::CheckProgramCancelled => {
self.pending_analysis
.take()
.expect("Expected a pending analysis.");
self.debounce_changes();
}
OrchestratorMessage::FileChanges(changes) => {
// Request cancellation, but wait until all analysis tasks have completed to
// avoid stale messages in the next main loop.
let pending = if let Some(pending_state) = self.pending_analysis.as_ref() {
pending_state.cancellation_token.cancel();
true
} else {
false
};
self.aggregated_changes.extend(changes);
// If there are no pending analysis tasks, apply the file changes. Otherwise
// keep running until all file checks have completed.
if !pending {
self.debounce_changes();
}
}
OrchestratorMessage::Shutdown => {
return self.shutdown();
}
}
}
}
fn debounce_changes(&mut self) {
debug_assert!(self.pending_analysis.is_none());
loop {
// Consume possibly incoming file change messages before running a new analysis, but don't wait for more than 100ms.
crossbeam_channel::select! {
recv(self.receiver) -> message => {
match message {
Ok(OrchestratorMessage::Shutdown) => {
return self.shutdown();
}
Ok(OrchestratorMessage::FileChanges(file_changes)) => {
self.aggregated_changes.extend(file_changes);
}
Ok(OrchestratorMessage::CheckProgramStarted {..}| OrchestratorMessage::CheckProgramCompleted(_) | OrchestratorMessage::CheckProgramCancelled) => unreachable!("No program check should be running while debouncing changes."),
Ok(OrchestratorMessage::Run) => unreachable!("The orchestrator is already running."),
Err(_) => {
// There are no more senders, no point in waiting for more messages
return;
}
}
},
default(std::time::Duration::from_millis(100)) => {
// No more file changes after 100 ms, send the changes and schedule a new analysis
self.sender.send(MainLoopMessage::ApplyChanges(std::mem::take(&mut self.aggregated_changes))).unwrap();
self.sender.send(MainLoopMessage::CheckProgram).unwrap();
return;
}
}
}
}
#[allow(clippy::unused_self)]
fn shutdown(&self) {
tracing::trace!("Shutting down orchestrator.");
}
}
#[derive(Debug)]
struct PendingAnalysisState {
cancellation_token: CancellationTokenSource,
}
/// Message sent from the orchestrator to the main loop.
#[derive(Debug)]
enum MainLoopMessage {
CheckProgram,
CheckCompleted(Vec<String>),
ApplyChanges(AggregatedChanges),
Exit,
}
#[derive(Debug)]
enum OrchestratorMessage {
Run,
Shutdown,
CheckProgramStarted {
cancellation_token: CancellationTokenSource,
},
CheckProgramCompleted(Vec<String>),
CheckProgramCancelled,
FileChanges(Vec<FileChange>),
}
#[derive(Default, Debug)]
struct AggregatedChanges {
changes: FxHashMap<FileId, FileChangeKind>,
}
impl AggregatedChanges {
fn add(&mut self, change: FileChange) {
match self.changes.entry(change.file_id()) {
Entry::Occupied(mut entry) => {
let merged = entry.get_mut();
match (merged, change.kind()) {
(FileChangeKind::Created, FileChangeKind::Deleted) => {
// Deletion after creations means that ruff never saw the file.
entry.remove();
}
(FileChangeKind::Created, FileChangeKind::Modified) => {
// No-op, for ruff, modifying a file that it doesn't yet know that it exists is still considered a creation.
}
(FileChangeKind::Modified, FileChangeKind::Created) => {
// Uhh, that should probably not happen. Continue considering it a modification.
}
(FileChangeKind::Modified, FileChangeKind::Deleted) => {
*entry.get_mut() = FileChangeKind::Deleted;
}
(FileChangeKind::Deleted, FileChangeKind::Created) => {
*entry.get_mut() = FileChangeKind::Modified;
}
(FileChangeKind::Deleted, FileChangeKind::Modified) => {
// That's weird, but let's consider it a modification.
*entry.get_mut() = FileChangeKind::Modified;
}
(FileChangeKind::Created, FileChangeKind::Created)
| (FileChangeKind::Modified, FileChangeKind::Modified)
| (FileChangeKind::Deleted, FileChangeKind::Deleted) => {
// No-op transitions. Some of them should be impossible but we handle them anyway.
}
}
}
Entry::Vacant(entry) => {
entry.insert(change.kind());
}
}
}
fn extend<I>(&mut self, changes: I)
where
I: IntoIterator<Item = FileChange>,
I::IntoIter: ExactSizeIterator,
{
let iter = changes.into_iter();
self.changes.reserve(iter.len());
for change in iter {
self.add(change);
}
}
fn iter(&self) -> impl Iterator<Item = FileChange> + '_ {
self.changes
.iter()
.map(|(id, kind)| FileChange::new(*id, *kind))
}
}
fn setup_tracing() {
let subscriber = Registry::default().with(
tracing_tree::HierarchicalLayer::default()
.with_indent_lines(true)
.with_indent_amount(2)
.with_bracketed_fields(true)
.with_thread_ids(true)
.with_targets(true)
.with_writer(|| Box::new(std::io::stderr()))
.with_timer(Uptime::default())
.with_filter(LoggingFilter {
trace_level: Level::TRACE,
}),
);
tracing::subscriber::set_global_default(subscriber).unwrap();
}
struct LoggingFilter {
trace_level: Level,
}
impl LoggingFilter {
fn is_enabled(&self, meta: &Metadata<'_>) -> bool {
let filter = if meta.target().starts_with("red_knot") || meta.target().starts_with("ruff") {
self.trace_level
} else {
Level::INFO
};
meta.level() <= &filter
}
}
impl<S> Filter<S> for LoggingFilter {
fn enabled(&self, meta: &Metadata<'_>, _cx: &Context<'_, S>) -> bool {
self.is_enabled(meta)
}
fn callsite_enabled(&self, meta: &'static Metadata<'static>) -> Interest {
if self.is_enabled(meta) {
Interest::always()
} else {
Interest::never()
}
}
fn max_level_hint(&self) -> Option<LevelFilter> {
Some(LevelFilter::from_level(self.trace_level))
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,95 @@
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use ruff_python_ast as ast;
use ruff_python_parser::{Mode, ParseError};
use ruff_text_size::{Ranged, TextRange};
use crate::cache::KeyValueCache;
use crate::db::{HasJar, SourceDb, SourceJar};
use crate::files::FileId;
#[derive(Debug, Clone, PartialEq)]
pub struct Parsed {
inner: Arc<ParsedInner>,
}
#[derive(Debug, PartialEq)]
struct ParsedInner {
ast: ast::ModModule,
errors: Vec<ParseError>,
}
impl Parsed {
fn new(ast: ast::ModModule, errors: Vec<ParseError>) -> Self {
Self {
inner: Arc::new(ParsedInner { ast, errors }),
}
}
pub(crate) fn from_text(text: &str) -> Self {
let result = ruff_python_parser::parse(text, Mode::Module);
let (module, errors) = match result {
Ok(ast::Mod::Module(module)) => (module, vec![]),
Ok(ast::Mod::Expression(expression)) => (
ast::ModModule {
range: expression.range(),
body: vec![ast::Stmt::Expr(ast::StmtExpr {
range: expression.range(),
value: expression.body,
})],
},
vec![],
),
Err(errors) => (
ast::ModModule {
range: TextRange::default(),
body: Vec::new(),
},
vec![errors],
),
};
Parsed::new(module, errors)
}
pub fn ast(&self) -> &ast::ModModule {
&self.inner.ast
}
pub fn errors(&self) -> &[ParseError] {
&self.inner.errors
}
}
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn parse<Db>(db: &Db, file_id: FileId) -> Parsed
where
Db: SourceDb + HasJar<SourceJar>,
{
let parsed = db.jar();
parsed.parsed.get(&file_id, |file_id| {
let source = db.source(*file_id);
Parsed::from_text(source.text())
})
}
#[derive(Debug, Default)]
pub struct ParsedStorage(KeyValueCache<FileId, Parsed>);
impl Deref for ParsedStorage {
type Target = KeyValueCache<FileId, Parsed>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for ParsedStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}

View File

@@ -0,0 +1,315 @@
use std::num::NonZeroUsize;
use rayon::max_num_threads;
use rustc_hash::FxHashSet;
use crate::cancellation::CancellationToken;
use crate::db::{SemanticDb, SourceDb};
use crate::files::FileId;
use crate::lint::Diagnostics;
use crate::program::Program;
use crate::symbols::Dependency;
impl Program {
/// Checks all open files in the workspace and its dependencies.
#[tracing::instrument(level = "debug", skip_all)]
pub fn check(
&self,
scheduler: &dyn CheckScheduler,
cancellation_token: CancellationToken,
) -> Result<Vec<String>, CheckError> {
let check_loop = CheckFilesLoop::new(scheduler, cancellation_token);
check_loop.run(self.workspace().open_files.iter().copied())
}
/// Checks a single file and its dependencies.
#[tracing::instrument(level = "debug", skip(self, scheduler, cancellation_token))]
pub fn check_file(
&self,
file: FileId,
scheduler: &dyn CheckScheduler,
cancellation_token: CancellationToken,
) -> Result<Vec<String>, CheckError> {
let check_loop = CheckFilesLoop::new(scheduler, cancellation_token);
check_loop.run([file].into_iter())
}
#[tracing::instrument(level = "debug", skip(self, context))]
fn do_check_file(
&self,
file: FileId,
context: &CheckContext,
) -> Result<Diagnostics, CheckError> {
context.cancelled_ok()?;
let symbol_table = self.symbol_table(file);
let dependencies = symbol_table.dependencies();
if !dependencies.is_empty() {
let module = self.file_to_module(file);
// TODO scheduling all dependencies here is wasteful if we don't infer any types on them
// but I think that's unlikely, so it is okay?
// Anyway, we need to figure out a way to retrieve the dependencies of a module
// from the persistent cache. So maybe it should be a separate query after all.
for dependency in dependencies {
let dependency_name = match dependency {
Dependency::Module(name) => Some(name.clone()),
Dependency::Relative { .. } => module
.as_ref()
.and_then(|module| module.resolve_dependency(self, dependency)),
};
if let Some(dependency_name) = dependency_name {
// TODO We may want to have a different check functions for non-first-party
// files because we only need to index them and not check them.
// Supporting non-first-party code also requires supporting typing stubs.
if let Some(dependency) = self.resolve_module(dependency_name) {
if dependency.path(self).root().kind().is_first_party() {
context.schedule_check_file(dependency.path(self).file());
}
}
}
}
}
let mut diagnostics = Vec::new();
if self.workspace().is_file_open(file) {
diagnostics.extend_from_slice(&self.lint_syntax(file));
diagnostics.extend_from_slice(&self.lint_semantic(file));
}
Ok(Diagnostics::from(diagnostics))
}
}
/// Schedules checks for files.
pub trait CheckScheduler {
/// Schedules a check for a file.
///
/// The check can either be run immediately on the current thread or the check can be queued
/// in a thread pool and ran asynchronously.
///
/// The order in which scheduled checks are executed is not guaranteed.
///
/// The implementation should call [`CheckFileTask::run`] to execute the check.
fn check_file(&self, file_task: CheckFileTask);
/// The maximum number of checks that can be run concurrently.
///
/// Returns `None` if the checks run on the current thread (no concurrency).
fn max_concurrency(&self) -> Option<NonZeroUsize>;
}
/// Scheduler that runs checks on a rayon thread pool.
pub struct RayonCheckScheduler<'program, 'scope_ref, 'scope> {
program: &'program Program,
scope: &'scope_ref rayon::Scope<'scope>,
}
impl<'program, 'scope_ref, 'scope> RayonCheckScheduler<'program, 'scope_ref, 'scope> {
pub fn new(program: &'program Program, scope: &'scope_ref rayon::Scope<'scope>) -> Self {
Self { program, scope }
}
}
impl<'program, 'scope_ref, 'scope> CheckScheduler
for RayonCheckScheduler<'program, 'scope_ref, 'scope>
where
'program: 'scope,
{
fn check_file(&self, check_file_task: CheckFileTask) {
let child_span =
tracing::trace_span!("check_file", file_id = check_file_task.file_id.as_u32());
let program = self.program;
self.scope
.spawn(move |_| child_span.in_scope(|| check_file_task.run(program)));
}
fn max_concurrency(&self) -> Option<NonZeroUsize> {
Some(NonZeroUsize::new(max_num_threads()).unwrap_or(NonZeroUsize::MIN))
}
}
/// Scheduler that runs all checks on the current thread.
pub struct SameThreadCheckScheduler<'a> {
program: &'a Program,
}
impl<'a> SameThreadCheckScheduler<'a> {
pub fn new(program: &'a Program) -> Self {
Self { program }
}
}
impl CheckScheduler for SameThreadCheckScheduler<'_> {
fn check_file(&self, task: CheckFileTask) {
task.run(self.program);
}
fn max_concurrency(&self) -> Option<NonZeroUsize> {
None
}
}
#[derive(Debug, Clone)]
pub enum CheckError {
Cancelled,
}
#[derive(Debug)]
pub struct CheckFileTask {
file_id: FileId,
context: CheckContext,
}
impl CheckFileTask {
/// Runs the check and communicates the result to the orchestrator.
pub fn run(self, program: &Program) {
match program.do_check_file(self.file_id, &self.context) {
Ok(diagnostics) => self
.context
.sender
.send(CheckFileMessage::Completed(diagnostics))
.unwrap(),
Err(CheckError::Cancelled) => self
.context
.sender
.send(CheckFileMessage::Cancelled)
.unwrap(),
}
}
}
#[derive(Clone, Debug)]
struct CheckContext {
cancellation_token: CancellationToken,
sender: crossbeam_channel::Sender<CheckFileMessage>,
}
impl CheckContext {
fn new(
cancellation_token: CancellationToken,
sender: crossbeam_channel::Sender<CheckFileMessage>,
) -> Self {
Self {
cancellation_token,
sender,
}
}
/// Queues a new file for checking using the [`CheckScheduler`].
#[allow(unused)]
fn schedule_check_file(&self, file_id: FileId) {
self.sender.send(CheckFileMessage::Queue(file_id)).unwrap();
}
/// Returns `true` if the check has been cancelled.
fn is_cancelled(&self) -> bool {
self.cancellation_token.is_cancelled()
}
fn cancelled_ok(&self) -> Result<(), CheckError> {
if self.is_cancelled() {
Err(CheckError::Cancelled)
} else {
Ok(())
}
}
}
struct CheckFilesLoop<'a> {
scheduler: &'a dyn CheckScheduler,
cancellation_token: CancellationToken,
pending: usize,
queued_files: FxHashSet<FileId>,
}
impl<'a> CheckFilesLoop<'a> {
fn new(scheduler: &'a dyn CheckScheduler, cancellation_token: CancellationToken) -> Self {
Self {
scheduler,
cancellation_token,
queued_files: FxHashSet::default(),
pending: 0,
}
}
fn run(mut self, files: impl Iterator<Item = FileId>) -> Result<Vec<String>, CheckError> {
let (sender, receiver) = if let Some(max_concurrency) = self.scheduler.max_concurrency() {
crossbeam_channel::bounded(max_concurrency.get())
} else {
// The checks run on the current thread. That means it is necessary to store all messages
// or we risk deadlocking when the main loop never gets a chance to read the messages.
crossbeam_channel::unbounded()
};
let context = CheckContext::new(self.cancellation_token.clone(), sender.clone());
for file in files {
self.queue_file(file, context.clone())?;
}
self.run_impl(receiver, &context)
}
fn run_impl(
mut self,
receiver: crossbeam_channel::Receiver<CheckFileMessage>,
context: &CheckContext,
) -> Result<Vec<String>, CheckError> {
if self.cancellation_token.is_cancelled() {
return Err(CheckError::Cancelled);
}
let mut result = Vec::default();
for message in receiver {
match message {
CheckFileMessage::Completed(diagnostics) => {
result.extend_from_slice(&diagnostics);
self.pending -= 1;
if self.pending == 0 {
break;
}
}
CheckFileMessage::Queue(id) => {
self.queue_file(id, context.clone())?;
}
CheckFileMessage::Cancelled => {
return Err(CheckError::Cancelled);
}
}
}
Ok(result)
}
fn queue_file(&mut self, file_id: FileId, context: CheckContext) -> Result<(), CheckError> {
if context.is_cancelled() {
return Err(CheckError::Cancelled);
}
if self.queued_files.insert(file_id) {
self.pending += 1;
self.scheduler
.check_file(CheckFileTask { file_id, context });
}
Ok(())
}
}
enum CheckFileMessage {
Completed(Diagnostics),
Queue(FileId),
Cancelled,
}

View File

@@ -0,0 +1,183 @@
pub mod check;
use std::path::Path;
use std::sync::Arc;
use crate::db::{Db, HasJar, SemanticDb, SemanticJar, SourceDb, SourceJar};
use crate::files::{FileId, Files};
use crate::lint::{
lint_semantic, lint_syntax, Diagnostics, LintSemanticStorage, LintSyntaxStorage,
};
use crate::module::{
add_module, file_to_module, path_to_module, resolve_module, set_module_search_paths, Module,
ModuleData, ModuleName, ModuleResolver, ModuleSearchPath,
};
use crate::parse::{parse, Parsed, ParsedStorage};
use crate::source::{source_text, Source, SourceStorage};
use crate::symbols::{symbol_table, SymbolId, SymbolTable, SymbolTablesStorage};
use crate::types::{infer_symbol_type, Type, TypeStore};
use crate::Workspace;
#[derive(Debug)]
pub struct Program {
files: Files,
source: SourceJar,
semantic: SemanticJar,
workspace: Workspace,
}
impl Program {
pub fn new(workspace: Workspace, module_search_paths: Vec<ModuleSearchPath>) -> Self {
Self {
source: SourceJar {
sources: SourceStorage::default(),
parsed: ParsedStorage::default(),
lint_syntax: LintSyntaxStorage::default(),
},
semantic: SemanticJar {
module_resolver: ModuleResolver::new(module_search_paths),
symbol_tables: SymbolTablesStorage::default(),
type_store: TypeStore::default(),
lint_semantic: LintSemanticStorage::default(),
},
files: Files::default(),
workspace,
}
}
pub fn apply_changes<I>(&mut self, changes: I)
where
I: IntoIterator<Item = FileChange>,
{
for change in changes {
self.semantic
.module_resolver
.remove_module(&self.file_path(change.id));
self.semantic.symbol_tables.remove(&change.id);
self.source.sources.remove(&change.id);
self.source.parsed.remove(&change.id);
self.source.lint_syntax.remove(&change.id);
// TODO: remove all dependent modules as well
self.semantic.type_store.remove_module(change.id);
self.semantic.lint_semantic.remove(&change.id);
}
}
pub fn files(&self) -> &Files {
&self.files
}
pub fn workspace(&self) -> &Workspace {
&self.workspace
}
pub fn workspace_mut(&mut self) -> &mut Workspace {
&mut self.workspace
}
}
impl SourceDb for Program {
fn file_id(&self, path: &Path) -> FileId {
self.files.intern(path)
}
fn file_path(&self, file_id: FileId) -> Arc<Path> {
self.files.path(file_id)
}
fn source(&self, file_id: FileId) -> Source {
source_text(self, file_id)
}
fn parse(&self, file_id: FileId) -> Parsed {
parse(self, file_id)
}
fn lint_syntax(&self, file_id: FileId) -> Diagnostics {
lint_syntax(self, file_id)
}
}
impl SemanticDb for Program {
fn resolve_module(&self, name: ModuleName) -> Option<Module> {
resolve_module(self, name)
}
fn file_to_module(&self, file_id: FileId) -> Option<Module> {
file_to_module(self, file_id)
}
fn path_to_module(&self, path: &Path) -> Option<Module> {
path_to_module(self, path)
}
fn symbol_table(&self, file_id: FileId) -> Arc<SymbolTable> {
symbol_table(self, file_id)
}
fn infer_symbol_type(&self, file_id: FileId, symbol_id: SymbolId) -> Type {
infer_symbol_type(self, file_id, symbol_id)
}
fn lint_semantic(&self, file_id: FileId) -> Diagnostics {
lint_semantic(self, file_id)
}
// Mutations
fn add_module(&mut self, path: &Path) -> Option<(Module, Vec<Arc<ModuleData>>)> {
add_module(self, path)
}
fn set_module_search_paths(&mut self, paths: Vec<ModuleSearchPath>) {
set_module_search_paths(self, paths);
}
}
impl Db for Program {}
impl HasJar<SourceJar> for Program {
fn jar(&self) -> &SourceJar {
&self.source
}
fn jar_mut(&mut self) -> &mut SourceJar {
&mut self.source
}
}
impl HasJar<SemanticJar> for Program {
fn jar(&self) -> &SemanticJar {
&self.semantic
}
fn jar_mut(&mut self) -> &mut SemanticJar {
&mut self.semantic
}
}
#[derive(Copy, Clone, Debug)]
pub struct FileChange {
id: FileId,
kind: FileChangeKind,
}
impl FileChange {
pub fn new(file_id: FileId, kind: FileChangeKind) -> Self {
Self { id: file_id, kind }
}
pub fn file_id(&self) -> FileId {
self.id
}
pub fn kind(&self) -> FileChangeKind {
self.kind
}
}
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum FileChangeKind {
Created,
Modified,
Deleted,
}

View File

@@ -0,0 +1,96 @@
use crate::cache::KeyValueCache;
use crate::db::{HasJar, SourceDb, SourceJar};
use ruff_notebook::Notebook;
use ruff_python_ast::PySourceType;
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use crate::files::FileId;
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn source_text<Db>(db: &Db, file_id: FileId) -> Source
where
Db: SourceDb + HasJar<SourceJar>,
{
let sources = &db.jar().sources;
sources.get(&file_id, |file_id| {
let path = db.file_path(*file_id);
let source_text = std::fs::read_to_string(&path).unwrap_or_else(|err| {
tracing::error!("Failed to read file '{path:?}: {err}'. Falling back to empty text");
String::new()
});
let python_ty = PySourceType::from(&path);
let kind = match python_ty {
PySourceType::Python => {
SourceKind::Python(Arc::from(source_text))
}
PySourceType::Stub => SourceKind::Stub(Arc::from(source_text)),
PySourceType::Ipynb => {
let notebook = Notebook::from_source_code(&source_text).unwrap_or_else(|err| {
// TODO should this be changed to never fail?
// or should we instead add a diagnostic somewhere? But what would we return in this case?
tracing::error!(
"Failed to parse notebook '{path:?}: {err}'. Falling back to an empty notebook"
);
Notebook::from_source_code("").unwrap()
});
SourceKind::IpyNotebook(Arc::new(notebook))
}
};
Source { kind }
})
}
#[derive(Debug, Clone, PartialEq)]
pub enum SourceKind {
Python(Arc<str>),
Stub(Arc<str>),
IpyNotebook(Arc<Notebook>),
}
#[derive(Debug, Clone, PartialEq)]
pub struct Source {
kind: SourceKind,
}
impl Source {
pub fn python<T: Into<Arc<str>>>(source: T) -> Self {
Self {
kind: SourceKind::Python(source.into()),
}
}
pub fn kind(&self) -> &SourceKind {
&self.kind
}
pub fn text(&self) -> &str {
match &self.kind {
SourceKind::Python(text) => text,
SourceKind::Stub(text) => text,
SourceKind::IpyNotebook(notebook) => notebook.source_code(),
}
}
}
#[derive(Debug, Default)]
pub struct SourceStorage(pub(crate) KeyValueCache<FileId, Source>);
impl Deref for SourceStorage {
type Target = KeyValueCache<FileId, Source>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for SourceStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}

View File

@@ -0,0 +1,939 @@
#![allow(dead_code)]
use std::hash::{Hash, Hasher};
use std::iter::{Copied, DoubleEndedIterator, FusedIterator};
use std::num::NonZeroU32;
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use bitflags::bitflags;
use hashbrown::hash_map::{Keys, RawEntryMut};
use rustc_hash::{FxHashMap, FxHasher};
use ruff_index::{newtype_index, IndexVec};
use ruff_python_ast as ast;
use ruff_python_ast::visitor::preorder::PreorderVisitor;
use crate::ast_ids::TypedNodeKey;
use crate::cache::KeyValueCache;
use crate::db::{HasJar, SemanticDb, SemanticJar};
use crate::files::FileId;
use crate::module::ModuleName;
use crate::Name;
#[allow(unreachable_pub)]
#[tracing::instrument(level = "debug", skip(db))]
pub fn symbol_table<Db>(db: &Db, file_id: FileId) -> Arc<SymbolTable>
where
Db: SemanticDb + HasJar<SemanticJar>,
{
let jar = db.jar();
jar.symbol_tables.get(&file_id, |_| {
let parsed = db.parse(file_id);
Arc::from(SymbolTable::from_ast(parsed.ast()))
})
}
type Map<K, V> = hashbrown::HashMap<K, V, ()>;
#[newtype_index]
pub(crate) struct ScopeId;
impl ScopeId {
pub(crate) fn scope(self, table: &SymbolTable) -> &Scope {
&table.scopes_by_id[self]
}
}
#[newtype_index]
pub struct SymbolId;
impl SymbolId {
pub(crate) fn symbol(self, table: &SymbolTable) -> &Symbol {
&table.symbols_by_id[self]
}
}
#[derive(Copy, Clone, Debug, PartialEq)]
pub(crate) enum ScopeKind {
Module,
Annotation,
Class,
Function,
}
#[derive(Debug)]
pub(crate) struct Scope {
name: Name,
kind: ScopeKind,
child_scopes: Vec<ScopeId>,
// symbol IDs, hashed by symbol name
symbols_by_name: Map<SymbolId, ()>,
}
impl Scope {
pub(crate) fn name(&self) -> &str {
self.name.as_str()
}
pub(crate) fn kind(&self) -> ScopeKind {
self.kind
}
}
#[derive(Debug)]
pub(crate) enum Kind {
FreeVar,
CellVar,
CellVarAssigned,
ExplicitGlobal,
ImplicitGlobal,
}
bitflags! {
#[derive(Copy,Clone,Debug)]
pub(crate) struct SymbolFlags: u8 {
const IS_USED = 1 << 0;
const IS_DEFINED = 1 << 1;
/// TODO: This flag is not yet set by anything
const MARKED_GLOBAL = 1 << 2;
/// TODO: This flag is not yet set by anything
const MARKED_NONLOCAL = 1 << 3;
}
}
#[derive(Debug)]
pub(crate) struct Symbol {
name: Name,
flags: SymbolFlags,
// kind: Kind,
}
impl Symbol {
pub(crate) fn name(&self) -> &str {
self.name.as_str()
}
/// Is the symbol used in its containing scope?
pub(crate) fn is_used(&self) -> bool {
self.flags.contains(SymbolFlags::IS_USED)
}
/// Is the symbol defined in its containing scope?
pub(crate) fn is_defined(&self) -> bool {
self.flags.contains(SymbolFlags::IS_DEFINED)
}
// TODO: implement Symbol.kind 2-pass analysis to categorize as: free-var, cell-var,
// explicit-global, implicit-global and implement Symbol.kind by modifying the preorder
// traversal code
}
// TODO storing TypedNodeKey for definitions means we have to search to find them again in the AST;
// this is at best O(log n). If looking up definitions is a bottleneck we should look for
// alternatives here.
#[derive(Clone, Debug)]
pub(crate) enum Definition {
// For the import cases, we don't need reference to any arbitrary AST subtrees (annotations,
// RHS), and referencing just the import statement node is imprecise (a single import statement
// can assign many symbols, we'd have to re-search for the one we care about), so we just copy
// the small amount of information we need from the AST.
Import(ImportDefinition),
ImportFrom(ImportFromDefinition),
ClassDef(TypedNodeKey<ast::StmtClassDef>),
FunctionDef(TypedNodeKey<ast::StmtFunctionDef>),
Assignment(TypedNodeKey<ast::StmtAssign>),
AnnotatedAssignment(TypedNodeKey<ast::StmtAnnAssign>),
// TODO with statements, except handlers, function args...
}
#[derive(Clone, Debug)]
pub(crate) struct ImportDefinition {
pub(crate) module: ModuleName,
}
#[derive(Clone, Debug)]
pub(crate) struct ImportFromDefinition {
pub(crate) module: Option<ModuleName>,
pub(crate) name: Name,
pub(crate) level: u32,
}
impl ImportFromDefinition {
pub(crate) fn module(&self) -> Option<&ModuleName> {
self.module.as_ref()
}
pub(crate) fn name(&self) -> &Name {
&self.name
}
pub(crate) fn level(&self) -> u32 {
self.level
}
}
#[derive(Debug, Clone)]
pub enum Dependency {
Module(ModuleName),
Relative {
level: NonZeroU32,
module: Option<ModuleName>,
},
}
/// Table of all symbols in all scopes for a module.
#[derive(Debug)]
pub struct SymbolTable {
scopes_by_id: IndexVec<ScopeId, Scope>,
symbols_by_id: IndexVec<SymbolId, Symbol>,
defs: FxHashMap<SymbolId, Vec<Definition>>,
dependencies: Vec<Dependency>,
}
impl SymbolTable {
pub(crate) fn from_ast(module: &ast::ModModule) -> Self {
let root_scope_id = SymbolTable::root_scope_id();
let mut builder = SymbolTableBuilder {
table: SymbolTable::new(),
scopes: vec![root_scope_id],
current_definition: None,
};
builder.visit_body(&module.body);
builder.table
}
pub(crate) fn new() -> Self {
let mut table = SymbolTable {
scopes_by_id: IndexVec::new(),
symbols_by_id: IndexVec::new(),
defs: FxHashMap::default(),
dependencies: Vec::new(),
};
table.scopes_by_id.push(Scope {
name: Name::new("<module>"),
kind: ScopeKind::Module,
child_scopes: Vec::new(),
symbols_by_name: Map::default(),
});
table
}
pub(crate) fn dependencies(&self) -> &[Dependency] {
&self.dependencies
}
pub(crate) const fn root_scope_id() -> ScopeId {
ScopeId::from_usize(0)
}
pub(crate) fn root_scope(&self) -> &Scope {
&self.scopes_by_id[SymbolTable::root_scope_id()]
}
pub(crate) fn symbol_ids_for_scope(&self, scope_id: ScopeId) -> Copied<Keys<SymbolId, ()>> {
self.scopes_by_id[scope_id].symbols_by_name.keys().copied()
}
pub(crate) fn symbols_for_scope(
&self,
scope_id: ScopeId,
) -> SymbolIterator<Copied<Keys<SymbolId, ()>>> {
SymbolIterator {
table: self,
ids: self.symbol_ids_for_scope(scope_id),
}
}
pub(crate) fn root_symbol_ids(&self) -> Copied<Keys<SymbolId, ()>> {
self.symbol_ids_for_scope(SymbolTable::root_scope_id())
}
pub(crate) fn root_symbols(&self) -> SymbolIterator<Copied<Keys<SymbolId, ()>>> {
self.symbols_for_scope(SymbolTable::root_scope_id())
}
pub(crate) fn child_scope_ids_of(&self, scope_id: ScopeId) -> &[ScopeId] {
&self.scopes_by_id[scope_id].child_scopes
}
pub(crate) fn child_scopes_of(&self, scope_id: ScopeId) -> ScopeIterator<&[ScopeId]> {
ScopeIterator {
table: self,
ids: self.child_scope_ids_of(scope_id),
}
}
pub(crate) fn root_child_scope_ids(&self) -> &[ScopeId] {
self.child_scope_ids_of(SymbolTable::root_scope_id())
}
pub(crate) fn root_child_scopes(&self) -> ScopeIterator<&[ScopeId]> {
self.child_scopes_of(SymbolTable::root_scope_id())
}
pub(crate) fn symbol_id_by_name(&self, scope_id: ScopeId, name: &str) -> Option<SymbolId> {
let scope = &self.scopes_by_id[scope_id];
let hash = SymbolTable::hash_name(name);
let name = Name::new(name);
scope
.symbols_by_name
.raw_entry()
.from_hash(hash, |symid| self.symbols_by_id[*symid].name == name)
.map(|(symbol_id, ())| *symbol_id)
}
pub(crate) fn symbol_by_name(&self, scope_id: ScopeId, name: &str) -> Option<&Symbol> {
Some(&self.symbols_by_id[self.symbol_id_by_name(scope_id, name)?])
}
pub(crate) fn root_symbol_id_by_name(&self, name: &str) -> Option<SymbolId> {
self.symbol_id_by_name(SymbolTable::root_scope_id(), name)
}
pub(crate) fn root_symbol_by_name(&self, name: &str) -> Option<&Symbol> {
self.symbol_by_name(SymbolTable::root_scope_id(), name)
}
pub(crate) fn definitions(&self, symbol_id: SymbolId) -> &[Definition] {
self.defs
.get(&symbol_id)
.map(std::vec::Vec::as_slice)
.unwrap_or_default()
}
pub(crate) fn all_definitions(&self) -> impl Iterator<Item = (SymbolId, &Definition)> + '_ {
self.defs
.iter()
.flat_map(|(sym_id, defs)| defs.iter().map(move |def| (*sym_id, def)))
}
fn add_or_update_symbol(
&mut self,
scope_id: ScopeId,
name: &str,
flags: SymbolFlags,
) -> SymbolId {
let hash = SymbolTable::hash_name(name);
let scope = &mut self.scopes_by_id[scope_id];
let name = Name::new(name);
let entry = scope
.symbols_by_name
.raw_entry_mut()
.from_hash(hash, |existing| self.symbols_by_id[*existing].name == name);
match entry {
RawEntryMut::Occupied(entry) => {
if let Some(symbol) = self.symbols_by_id.get_mut(*entry.key()) {
symbol.flags.insert(flags);
};
*entry.key()
}
RawEntryMut::Vacant(entry) => {
let id = self.symbols_by_id.push(Symbol { name, flags });
entry.insert_with_hasher(hash, id, (), |_| hash);
id
}
}
}
fn add_child_scope(
&mut self,
parent_scope_id: ScopeId,
name: &str,
kind: ScopeKind,
) -> ScopeId {
let new_scope_id = self.scopes_by_id.push(Scope {
name: Name::new(name),
kind,
child_scopes: Vec::new(),
symbols_by_name: Map::default(),
});
let parent_scope = &mut self.scopes_by_id[parent_scope_id];
parent_scope.child_scopes.push(new_scope_id);
new_scope_id
}
fn hash_name(name: &str) -> u64 {
let mut hasher = FxHasher::default();
name.hash(&mut hasher);
hasher.finish()
}
}
pub(crate) struct SymbolIterator<'a, I> {
table: &'a SymbolTable,
ids: I,
}
impl<'a, I> Iterator for SymbolIterator<'a, I>
where
I: Iterator<Item = SymbolId>,
{
type Item = &'a Symbol;
fn next(&mut self) -> Option<Self::Item> {
let id = self.ids.next()?;
Some(&self.table.symbols_by_id[id])
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.ids.size_hint()
}
}
impl<'a, I> FusedIterator for SymbolIterator<'a, I> where
I: Iterator<Item = SymbolId> + FusedIterator
{
}
impl<'a, I> DoubleEndedIterator for SymbolIterator<'a, I>
where
I: Iterator<Item = SymbolId> + DoubleEndedIterator,
{
fn next_back(&mut self) -> Option<Self::Item> {
let id = self.ids.next_back()?;
Some(&self.table.symbols_by_id[id])
}
}
pub(crate) struct ScopeIterator<'a, I> {
table: &'a SymbolTable,
ids: I,
}
impl<'a, I> Iterator for ScopeIterator<'a, I>
where
I: Iterator<Item = ScopeId>,
{
type Item = &'a Scope;
fn next(&mut self) -> Option<Self::Item> {
let id = self.ids.next()?;
Some(&self.table.scopes_by_id[id])
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.ids.size_hint()
}
}
impl<'a, I> FusedIterator for ScopeIterator<'a, I> where I: Iterator<Item = ScopeId> + FusedIterator {}
impl<'a, I> DoubleEndedIterator for ScopeIterator<'a, I>
where
I: Iterator<Item = ScopeId> + DoubleEndedIterator,
{
fn next_back(&mut self) -> Option<Self::Item> {
let id = self.ids.next_back()?;
Some(&self.table.scopes_by_id[id])
}
}
struct SymbolTableBuilder {
table: SymbolTable,
scopes: Vec<ScopeId>,
/// the definition whose target(s) we are currently walking
current_definition: Option<Definition>,
}
impl SymbolTableBuilder {
fn add_or_update_symbol(&mut self, identifier: &str, flags: SymbolFlags) -> SymbolId {
self.table
.add_or_update_symbol(self.cur_scope(), identifier, flags)
}
fn add_or_update_symbol_with_def(
&mut self,
identifier: &str,
definition: Definition,
) -> SymbolId {
let symbol_id = self.add_or_update_symbol(identifier, SymbolFlags::IS_DEFINED);
self.table
.defs
.entry(symbol_id)
.or_default()
.push(definition);
symbol_id
}
fn push_scope(&mut self, child_of: ScopeId, name: &str, kind: ScopeKind) -> ScopeId {
let scope_id = self.table.add_child_scope(child_of, name, kind);
self.scopes.push(scope_id);
scope_id
}
fn pop_scope(&mut self) -> ScopeId {
self.scopes
.pop()
.expect("Scope stack should never be empty")
}
fn cur_scope(&self) -> ScopeId {
*self
.scopes
.last()
.expect("Scope stack should never be empty")
}
fn with_type_params(
&mut self,
name: &str,
params: &Option<Box<ast::TypeParams>>,
nested: impl FnOnce(&mut Self),
) {
if let Some(type_params) = params {
self.push_scope(self.cur_scope(), name, ScopeKind::Annotation);
for type_param in &type_params.type_params {
let name = match type_param {
ast::TypeParam::TypeVar(ast::TypeParamTypeVar { name, .. }) => name,
ast::TypeParam::ParamSpec(ast::TypeParamParamSpec { name, .. }) => name,
ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple { name, .. }) => name,
};
self.add_or_update_symbol(name, SymbolFlags::IS_DEFINED);
}
}
nested(self);
if params.is_some() {
self.pop_scope();
}
}
}
impl PreorderVisitor<'_> for SymbolTableBuilder {
fn visit_expr(&mut self, expr: &ast::Expr) {
if let ast::Expr::Name(ast::ExprName { id, ctx, .. }) = expr {
let flags = match ctx {
ast::ExprContext::Load => SymbolFlags::IS_USED,
ast::ExprContext::Store => SymbolFlags::IS_DEFINED,
ast::ExprContext::Del => SymbolFlags::IS_DEFINED,
ast::ExprContext::Invalid => SymbolFlags::empty(),
};
self.add_or_update_symbol(id, flags);
if flags.contains(SymbolFlags::IS_DEFINED) {
if let Some(curdef) = self.current_definition.clone() {
self.add_or_update_symbol_with_def(id, curdef);
}
}
}
ast::visitor::preorder::walk_expr(self, expr);
}
fn visit_stmt(&mut self, stmt: &ast::Stmt) {
// TODO need to capture more definition statements here
match stmt {
ast::Stmt::ClassDef(node) => {
let def = Definition::ClassDef(TypedNodeKey::from_node(node));
self.add_or_update_symbol_with_def(&node.name, def);
self.with_type_params(&node.name, &node.type_params, |builder| {
builder.push_scope(builder.cur_scope(), &node.name, ScopeKind::Class);
ast::visitor::preorder::walk_stmt(builder, stmt);
builder.pop_scope();
});
}
ast::Stmt::FunctionDef(node) => {
let def = Definition::FunctionDef(TypedNodeKey::from_node(node));
self.add_or_update_symbol_with_def(&node.name, def);
self.with_type_params(&node.name, &node.type_params, |builder| {
builder.push_scope(builder.cur_scope(), &node.name, ScopeKind::Function);
ast::visitor::preorder::walk_stmt(builder, stmt);
builder.pop_scope();
});
}
ast::Stmt::Import(ast::StmtImport { names, .. }) => {
for alias in names {
let symbol_name = if let Some(asname) = &alias.asname {
asname.id.as_str()
} else {
alias.name.id.split('.').next().unwrap()
};
let module = ModuleName::new(&alias.name.id);
let def = Definition::Import(ImportDefinition {
module: module.clone(),
});
self.add_or_update_symbol_with_def(symbol_name, def);
self.table.dependencies.push(Dependency::Module(module));
}
}
ast::Stmt::ImportFrom(ast::StmtImportFrom {
module,
names,
level,
..
}) => {
let module = module.as_ref().map(|m| ModuleName::new(&m.id));
for alias in names {
let symbol_name = if let Some(asname) = &alias.asname {
asname.id.as_str()
} else {
alias.name.id.as_str()
};
let def = Definition::ImportFrom(ImportFromDefinition {
module: module.clone(),
name: Name::new(&alias.name.id),
level: *level,
});
self.add_or_update_symbol_with_def(symbol_name, def);
}
let dependency = if let Some(module) = module {
match NonZeroU32::new(*level) {
Some(level) => Dependency::Relative {
level,
module: Some(module),
},
None => Dependency::Module(module),
}
} else {
Dependency::Relative {
level: NonZeroU32::new(*level)
.expect("Import without a module to have a level > 0"),
module,
}
};
self.table.dependencies.push(dependency);
}
ast::Stmt::Assign(node) => {
debug_assert!(self.current_definition.is_none());
self.current_definition =
Some(Definition::Assignment(TypedNodeKey::from_node(node)));
ast::visitor::preorder::walk_stmt(self, stmt);
self.current_definition = None;
}
_ => {
ast::visitor::preorder::walk_stmt(self, stmt);
}
}
}
}
#[derive(Debug, Default)]
pub struct SymbolTablesStorage(KeyValueCache<FileId, Arc<SymbolTable>>);
impl Deref for SymbolTablesStorage {
type Target = KeyValueCache<FileId, Arc<SymbolTable>>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for SymbolTablesStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
#[cfg(test)]
mod tests {
use textwrap::dedent;
use crate::parse::Parsed;
use crate::symbols::ScopeKind;
use super::{SymbolFlags, SymbolId, SymbolIterator, SymbolTable};
mod from_ast {
use super::*;
fn parse(code: &str) -> Parsed {
Parsed::from_text(&dedent(code))
}
fn names<I>(it: SymbolIterator<I>) -> Vec<&str>
where
I: Iterator<Item = SymbolId>,
{
let mut symbols: Vec<_> = it.map(|sym| sym.name.as_str()).collect();
symbols.sort_unstable();
symbols
}
#[test]
fn empty() {
let parsed = parse("");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()).len(), 0);
}
#[test]
fn simple() {
let parsed = parse("x");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("x").unwrap())
.len(),
0
);
}
#[test]
fn annotation_only() {
let parsed = parse("x: int");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["int", "x"]);
// TODO record definition
}
#[test]
fn import() {
let parsed = parse("import foo");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["foo"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("foo").unwrap())
.len(),
1
);
}
#[test]
fn import_sub() {
let parsed = parse("import foo.bar");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["foo"]);
}
#[test]
fn import_as() {
let parsed = parse("import foo.bar as baz");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["baz"]);
}
#[test]
fn import_from() {
let parsed = parse("from bar import foo");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["foo"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("foo").unwrap())
.len(),
1
);
assert!(
table.root_symbol_id_by_name("foo").is_some_and(|sid| {
let s = sid.symbol(&table);
s.is_defined() || !s.is_used()
}),
"symbols that are defined get the defined flag"
);
}
#[test]
fn assign() {
let parsed = parse("x = foo");
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["foo", "x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("x").unwrap())
.len(),
1
);
assert!(
table.root_symbol_id_by_name("foo").is_some_and(|sid| {
let s = sid.symbol(&table);
!s.is_defined() && s.is_used()
}),
"a symbol used but not defined in a scope should have only the used flag"
);
}
#[test]
fn class_scope() {
let parsed = parse(
"
class C:
x = 1
y = 2
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["C", "y"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let c_scope = scopes[0].scope(&table);
assert_eq!(c_scope.kind(), ScopeKind::Class);
assert_eq!(c_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("C").unwrap())
.len(),
1
);
}
#[test]
fn func_scope() {
let parsed = parse(
"
def func():
x = 1
y = 2
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["func", "y"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let func_scope = scopes[0].scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Function);
assert_eq!(func_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("func").unwrap())
.len(),
1
);
}
#[test]
fn dupes() {
let parsed = parse(
"
def func():
x = 1
def func():
y = 2
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["func"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 2);
let func_scope_1 = scopes[0].scope(&table);
let func_scope_2 = scopes[1].scope(&table);
assert_eq!(func_scope_1.kind(), ScopeKind::Function);
assert_eq!(func_scope_1.name(), "func");
assert_eq!(func_scope_2.kind(), ScopeKind::Function);
assert_eq!(func_scope_2.name(), "func");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(names(table.symbols_for_scope(scopes[1])), vec!["y"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("func").unwrap())
.len(),
2
);
}
#[test]
fn generic_func() {
let parsed = parse(
"
def func[T]():
x = 1
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["func"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let ann_scope_id = scopes[0];
let ann_scope = ann_scope_id.scope(&table);
assert_eq!(ann_scope.kind(), ScopeKind::Annotation);
assert_eq!(ann_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(ann_scope_id)), vec!["T"]);
let scopes = table.child_scope_ids_of(ann_scope_id);
assert_eq!(scopes.len(), 1);
let func_scope_id = scopes[0];
let func_scope = func_scope_id.scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Function);
assert_eq!(func_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(func_scope_id)), vec!["x"]);
}
#[test]
fn generic_class() {
let parsed = parse(
"
class C[T]:
x = 1
",
);
let table = SymbolTable::from_ast(parsed.ast());
assert_eq!(names(table.root_symbols()), vec!["C"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let ann_scope_id = scopes[0];
let ann_scope = ann_scope_id.scope(&table);
assert_eq!(ann_scope.kind(), ScopeKind::Annotation);
assert_eq!(ann_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(ann_scope_id)), vec!["T"]);
assert!(
table
.symbol_by_name(ann_scope_id, "T")
.is_some_and(|s| s.is_defined() && !s.is_used()),
"type parameters are defined by the scope that introduces them"
);
let scopes = table.child_scope_ids_of(ann_scope_id);
assert_eq!(scopes.len(), 1);
let func_scope_id = scopes[0];
let func_scope = func_scope_id.scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Class);
assert_eq!(func_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(func_scope_id)), vec!["x"]);
}
}
#[test]
fn insert_same_name_symbol_twice() {
let mut table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let symbol_id_1 = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::IS_DEFINED);
let symbol_id_2 = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::IS_USED);
assert_eq!(symbol_id_1, symbol_id_2);
assert!(symbol_id_1.symbol(&table).is_used(), "flags must merge");
assert!(symbol_id_1.symbol(&table).is_defined(), "flags must merge");
}
#[test]
fn insert_different_named_symbols() {
let mut table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let symbol_id_1 = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let symbol_id_2 = table.add_or_update_symbol(root_scope_id, "bar", SymbolFlags::empty());
assert_ne!(symbol_id_1, symbol_id_2);
}
#[test]
fn add_child_scope_with_symbol() {
let mut table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let foo_symbol_top = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let c_scope = table.add_child_scope(root_scope_id, "C", ScopeKind::Class);
let foo_symbol_inner = table.add_or_update_symbol(c_scope, "foo", SymbolFlags::empty());
assert_ne!(foo_symbol_top, foo_symbol_inner);
}
#[test]
fn scope_from_id() {
let table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let scope = root_scope_id.scope(&table);
assert_eq!(scope.name.as_str(), "<module>");
assert_eq!(scope.kind, ScopeKind::Module);
}
#[test]
fn symbol_from_id() {
let mut table = SymbolTable::new();
let root_scope_id = SymbolTable::root_scope_id();
let foo_symbol_id = table.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let symbol = foo_symbol_id.symbol(&table);
assert_eq!(symbol.name.as_str(), "foo");
}
}

View File

@@ -0,0 +1,560 @@
#![allow(dead_code)]
use crate::ast_ids::NodeKey;
use crate::files::FileId;
use crate::symbols::SymbolId;
use crate::{FxDashMap, FxIndexSet, Name};
use ruff_index::{newtype_index, IndexVec};
use rustc_hash::FxHashMap;
pub(crate) mod infer;
pub(crate) use infer::infer_symbol_type;
/// unique ID for a type
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum Type {
/// the dynamic or gradual type: a statically-unknown set of values
Any,
/// the empty set of values
Never,
/// unknown type (no annotation)
/// equivalent to Any, or to object in strict mode
Unknown,
/// name is not bound to any value
Unbound,
/// a specific function object
Function(FunctionTypeId),
/// a specific class object
Class(ClassTypeId),
/// the set of Python objects with the given class in their __class__'s method resolution order
Instance(ClassTypeId),
Union(UnionTypeId),
Intersection(IntersectionTypeId),
// TODO protocols, callable types, overloads, generics, type vars
}
impl Type {
fn display<'a>(&'a self, store: &'a TypeStore) -> DisplayType<'a> {
DisplayType { ty: self, store }
}
pub const fn is_unbound(&self) -> bool {
matches!(self, Type::Unbound)
}
pub const fn is_unknown(&self) -> bool {
matches!(self, Type::Unknown)
}
}
impl From<FunctionTypeId> for Type {
fn from(id: FunctionTypeId) -> Self {
Type::Function(id)
}
}
impl From<UnionTypeId> for Type {
fn from(id: UnionTypeId) -> Self {
Type::Union(id)
}
}
impl From<IntersectionTypeId> for Type {
fn from(id: IntersectionTypeId) -> Self {
Type::Intersection(id)
}
}
// TODO: currently calling `get_function` et al and holding on to the `FunctionTypeRef` will lock a
// shard of this dashmap, for as long as you hold the reference. This may be a problem. We could
// switch to having all the arenas hold Arc, or we could see if we can split up ModuleTypeStore,
// and/or give it inner mutability and finer-grained internal locking.
#[derive(Debug, Default)]
pub struct TypeStore {
modules: FxDashMap<FileId, ModuleTypeStore>,
}
impl TypeStore {
pub fn remove_module(&mut self, file_id: FileId) {
self.modules.remove(&file_id);
}
pub fn cache_symbol_type(&self, file_id: FileId, symbol_id: SymbolId, ty: Type) {
self.add_or_get_module(file_id)
.symbol_types
.insert(symbol_id, ty);
}
pub fn cache_node_type(&self, file_id: FileId, node_key: NodeKey, ty: Type) {
self.add_or_get_module(file_id)
.node_types
.insert(node_key, ty);
}
pub fn get_cached_symbol_type(&self, file_id: FileId, symbol_id: SymbolId) -> Option<Type> {
self.try_get_module(file_id)?
.symbol_types
.get(&symbol_id)
.copied()
}
pub fn get_cached_node_type(&self, file_id: FileId, node_key: &NodeKey) -> Option<Type> {
self.try_get_module(file_id)?
.node_types
.get(node_key)
.copied()
}
fn add_or_get_module(&self, file_id: FileId) -> ModuleStoreRefMut {
self.modules
.entry(file_id)
.or_insert_with(|| ModuleTypeStore::new(file_id))
}
fn get_module(&self, file_id: FileId) -> ModuleStoreRef {
self.try_get_module(file_id).expect("module should exist")
}
fn try_get_module(&self, file_id: FileId) -> Option<ModuleStoreRef> {
self.modules.get(&file_id)
}
fn add_function(&self, file_id: FileId, name: &str) -> FunctionTypeId {
self.add_or_get_module(file_id).add_function(name)
}
fn add_class(&self, file_id: FileId, name: &str, bases: Vec<Type>) -> ClassTypeId {
self.add_or_get_module(file_id).add_class(name, bases)
}
fn add_union(&mut self, file_id: FileId, elems: &[Type]) -> UnionTypeId {
self.add_or_get_module(file_id).add_union(elems)
}
fn add_intersection(
&mut self,
file_id: FileId,
positive: &[Type],
negative: &[Type],
) -> IntersectionTypeId {
self.add_or_get_module(file_id)
.add_intersection(positive, negative)
}
fn get_function(&self, id: FunctionTypeId) -> FunctionTypeRef {
FunctionTypeRef {
module_store: self.get_module(id.file_id),
function_id: id.func_id,
}
}
fn get_class(&self, id: ClassTypeId) -> ClassTypeRef {
ClassTypeRef {
module_store: self.get_module(id.file_id),
class_id: id.class_id,
}
}
fn get_union(&self, id: UnionTypeId) -> UnionTypeRef {
UnionTypeRef {
module_store: self.get_module(id.file_id),
union_id: id.union_id,
}
}
fn get_intersection(&self, id: IntersectionTypeId) -> IntersectionTypeRef {
IntersectionTypeRef {
module_store: self.get_module(id.file_id),
intersection_id: id.intersection_id,
}
}
}
type ModuleStoreRef<'a> = dashmap::mapref::one::Ref<
'a,
FileId,
ModuleTypeStore,
std::hash::BuildHasherDefault<rustc_hash::FxHasher>,
>;
type ModuleStoreRefMut<'a> = dashmap::mapref::one::RefMut<
'a,
FileId,
ModuleTypeStore,
std::hash::BuildHasherDefault<rustc_hash::FxHasher>,
>;
#[derive(Debug)]
pub(crate) struct FunctionTypeRef<'a> {
module_store: ModuleStoreRef<'a>,
function_id: ModuleFunctionTypeId,
}
impl<'a> std::ops::Deref for FunctionTypeRef<'a> {
type Target = FunctionType;
fn deref(&self) -> &Self::Target {
self.module_store.get_function(self.function_id)
}
}
#[derive(Debug)]
pub(crate) struct ClassTypeRef<'a> {
module_store: ModuleStoreRef<'a>,
class_id: ModuleClassTypeId,
}
impl<'a> std::ops::Deref for ClassTypeRef<'a> {
type Target = ClassType;
fn deref(&self) -> &Self::Target {
self.module_store.get_class(self.class_id)
}
}
#[derive(Debug)]
pub(crate) struct UnionTypeRef<'a> {
module_store: ModuleStoreRef<'a>,
union_id: ModuleUnionTypeId,
}
impl<'a> std::ops::Deref for UnionTypeRef<'a> {
type Target = UnionType;
fn deref(&self) -> &Self::Target {
self.module_store.get_union(self.union_id)
}
}
#[derive(Debug)]
pub(crate) struct IntersectionTypeRef<'a> {
module_store: ModuleStoreRef<'a>,
intersection_id: ModuleIntersectionTypeId,
}
impl<'a> std::ops::Deref for IntersectionTypeRef<'a> {
type Target = IntersectionType;
fn deref(&self) -> &Self::Target {
self.module_store.get_intersection(self.intersection_id)
}
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct FunctionTypeId {
file_id: FileId,
func_id: ModuleFunctionTypeId,
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct ClassTypeId {
file_id: FileId,
class_id: ModuleClassTypeId,
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct UnionTypeId {
file_id: FileId,
union_id: ModuleUnionTypeId,
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct IntersectionTypeId {
file_id: FileId,
intersection_id: ModuleIntersectionTypeId,
}
#[newtype_index]
struct ModuleFunctionTypeId;
#[newtype_index]
struct ModuleClassTypeId;
#[newtype_index]
struct ModuleUnionTypeId;
#[newtype_index]
struct ModuleIntersectionTypeId;
#[derive(Debug)]
struct ModuleTypeStore {
file_id: FileId,
/// arena of all function types defined in this module
functions: IndexVec<ModuleFunctionTypeId, FunctionType>,
/// arena of all class types defined in this module
classes: IndexVec<ModuleClassTypeId, ClassType>,
/// arenda of all union types created in this module
unions: IndexVec<ModuleUnionTypeId, UnionType>,
/// arena of all intersection types created in this module
intersections: IndexVec<ModuleIntersectionTypeId, IntersectionType>,
/// cached types of symbols in this module
symbol_types: FxHashMap<SymbolId, Type>,
/// cached types of AST nodes in this module
node_types: FxHashMap<NodeKey, Type>,
}
impl ModuleTypeStore {
fn new(file_id: FileId) -> Self {
Self {
file_id,
functions: IndexVec::default(),
classes: IndexVec::default(),
unions: IndexVec::default(),
intersections: IndexVec::default(),
symbol_types: FxHashMap::default(),
node_types: FxHashMap::default(),
}
}
fn add_function(&mut self, name: &str) -> FunctionTypeId {
let func_id = self.functions.push(FunctionType {
name: Name::new(name),
});
FunctionTypeId {
file_id: self.file_id,
func_id,
}
}
fn add_class(&mut self, name: &str, bases: Vec<Type>) -> ClassTypeId {
let class_id = self.classes.push(ClassType {
name: Name::new(name),
// TODO: if no bases are given, that should imply [object]
bases,
});
ClassTypeId {
file_id: self.file_id,
class_id,
}
}
fn add_union(&mut self, elems: &[Type]) -> UnionTypeId {
let union_id = self.unions.push(UnionType {
elements: elems.iter().copied().collect(),
});
UnionTypeId {
file_id: self.file_id,
union_id,
}
}
fn add_intersection(&mut self, positive: &[Type], negative: &[Type]) -> IntersectionTypeId {
let intersection_id = self.intersections.push(IntersectionType {
positive: positive.iter().copied().collect(),
negative: negative.iter().copied().collect(),
});
IntersectionTypeId {
file_id: self.file_id,
intersection_id,
}
}
fn get_function(&self, func_id: ModuleFunctionTypeId) -> &FunctionType {
&self.functions[func_id]
}
fn get_class(&self, class_id: ModuleClassTypeId) -> &ClassType {
&self.classes[class_id]
}
fn get_union(&self, union_id: ModuleUnionTypeId) -> &UnionType {
&self.unions[union_id]
}
fn get_intersection(&self, intersection_id: ModuleIntersectionTypeId) -> &IntersectionType {
&self.intersections[intersection_id]
}
}
#[derive(Copy, Clone, Debug)]
struct DisplayType<'a> {
ty: &'a Type,
store: &'a TypeStore,
}
impl std::fmt::Display for DisplayType<'_> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self.ty {
Type::Any => f.write_str("Any"),
Type::Never => f.write_str("Never"),
Type::Unknown => f.write_str("Unknown"),
Type::Unbound => f.write_str("Unbound"),
// TODO functions and classes should display using a fully qualified name
Type::Class(class_id) => {
f.write_str("Literal[")?;
f.write_str(self.store.get_class(*class_id).name())?;
f.write_str("]")
}
Type::Instance(class_id) => f.write_str(self.store.get_class(*class_id).name()),
Type::Function(func_id) => f.write_str(self.store.get_function(*func_id).name()),
Type::Union(union_id) => self
.store
.get_module(union_id.file_id)
.get_union(union_id.union_id)
.display(f, self.store),
Type::Intersection(int_id) => self
.store
.get_module(int_id.file_id)
.get_intersection(int_id.intersection_id)
.display(f, self.store),
}
}
}
#[derive(Debug)]
pub(crate) struct ClassType {
name: Name,
bases: Vec<Type>,
}
impl ClassType {
fn name(&self) -> &str {
self.name.as_str()
}
fn bases(&self) -> &[Type] {
self.bases.as_slice()
}
}
#[derive(Debug)]
pub(crate) struct FunctionType {
name: Name,
}
impl FunctionType {
fn name(&self) -> &str {
self.name.as_str()
}
}
#[derive(Debug)]
pub(crate) struct UnionType {
// the union type includes values in any of these types
elements: FxIndexSet<Type>,
}
impl UnionType {
fn display(&self, f: &mut std::fmt::Formatter<'_>, store: &TypeStore) -> std::fmt::Result {
f.write_str("(")?;
let mut first = true;
for ty in &self.elements {
if !first {
f.write_str(" | ")?;
};
first = false;
write!(f, "{}", ty.display(store))?;
}
f.write_str(")")
}
}
// Negation types aren't expressible in annotations, and are most likely to arise from type
// narrowing along with intersections (e.g. `if not isinstance(...)`), so we represent them
// directly in intersections rather than as a separate type. This sacrifices some efficiency in the
// case where a Not appears outside an intersection (unclear when that could even happen, but we'd
// have to represent it as a single-element intersection if it did) in exchange for better
// efficiency in the not-within-intersection case.
#[derive(Debug)]
pub(crate) struct IntersectionType {
// the intersection type includes only values in all of these types
positive: FxIndexSet<Type>,
// negated elements of the intersection, e.g.
negative: FxIndexSet<Type>,
}
impl IntersectionType {
fn display(&self, f: &mut std::fmt::Formatter<'_>, store: &TypeStore) -> std::fmt::Result {
f.write_str("(")?;
let mut first = true;
for (neg, ty) in self
.positive
.iter()
.map(|ty| (false, ty))
.chain(self.negative.iter().map(|ty| (true, ty)))
{
if !first {
f.write_str(" & ")?;
};
first = false;
if neg {
f.write_str("~")?;
};
write!(f, "{}", ty.display(store))?;
}
f.write_str(")")
}
}
#[cfg(test)]
mod tests {
use crate::files::Files;
use crate::types::{Type, TypeStore};
use crate::FxIndexSet;
use std::path::Path;
#[test]
fn add_class() {
let store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let id = store.add_class(file_id, "C", Vec::new());
assert_eq!(store.get_class(id).name(), "C");
let inst = Type::Instance(id);
assert_eq!(format!("{}", inst.display(&store)), "C");
}
#[test]
fn add_function() {
let store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let id = store.add_function(file_id, "func");
assert_eq!(store.get_function(id).name(), "func");
let func = Type::Function(id);
assert_eq!(format!("{}", func.display(&store)), "func");
}
#[test]
fn add_union() {
let mut store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let c1 = store.add_class(file_id, "C1", Vec::new());
let c2 = store.add_class(file_id, "C2", Vec::new());
let elems = vec![Type::Instance(c1), Type::Instance(c2)];
let id = store.add_union(file_id, &elems);
assert_eq!(
store.get_union(id).elements,
elems.into_iter().collect::<FxIndexSet<_>>()
);
let union = Type::Union(id);
assert_eq!(format!("{}", union.display(&store)), "(C1 | C2)");
}
#[test]
fn add_intersection() {
let mut store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let c1 = store.add_class(file_id, "C1", Vec::new());
let c2 = store.add_class(file_id, "C2", Vec::new());
let c3 = store.add_class(file_id, "C3", Vec::new());
let pos = vec![Type::Instance(c1), Type::Instance(c2)];
let neg = vec![Type::Instance(c3)];
let id = store.add_intersection(file_id, &pos, &neg);
assert_eq!(
store.get_intersection(id).positive,
pos.into_iter().collect::<FxIndexSet<_>>()
);
assert_eq!(
store.get_intersection(id).negative,
neg.into_iter().collect::<FxIndexSet<_>>()
);
let intersection = Type::Intersection(id);
assert_eq!(
format!("{}", intersection.display(&store)),
"(C1 & C2 & ~C3)"
);
}
}

View File

@@ -0,0 +1,219 @@
#![allow(dead_code)]
use ruff_python_ast::AstNode;
use crate::db::{HasJar, SemanticDb, SemanticJar};
use crate::module::ModuleName;
use crate::symbols::{Definition, ImportFromDefinition, SymbolId};
use crate::types::Type;
use crate::FileId;
use ruff_python_ast as ast;
// FIXME: Figure out proper dead-lock free synchronisation now that this takes `&db` instead of `&mut db`.
#[tracing::instrument(level = "trace", skip(db))]
pub fn infer_symbol_type<Db>(db: &Db, file_id: FileId, symbol_id: SymbolId) -> Type
where
Db: SemanticDb + HasJar<SemanticJar>,
{
let symbols = db.symbol_table(file_id);
let defs = symbols.definitions(symbol_id);
if let Some(ty) = db
.jar()
.type_store
.get_cached_symbol_type(file_id, symbol_id)
{
return ty;
}
// TODO handle multiple defs, conditional defs...
assert_eq!(defs.len(), 1);
let ty = match &defs[0] {
Definition::ImportFrom(ImportFromDefinition {
module,
name,
level,
}) => {
// TODO relative imports
assert!(matches!(level, 0));
let module_name = ModuleName::new(module.as_ref().expect("TODO relative imports"));
if let Some(module) = db.resolve_module(module_name) {
let remote_file_id = module.path(db).file();
let remote_symbols = db.symbol_table(remote_file_id);
if let Some(remote_symbol_id) = remote_symbols.root_symbol_id_by_name(name) {
db.infer_symbol_type(remote_file_id, remote_symbol_id)
} else {
Type::Unknown
}
} else {
Type::Unknown
}
}
Definition::ClassDef(node_key) => db
.jar()
.type_store
.get_cached_node_type(file_id, node_key.erased())
.unwrap_or_else(|| {
let parsed = db.parse(file_id);
let ast = parsed.ast();
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
let bases: Vec<_> = node
.bases()
.iter()
.map(|base_expr| infer_expr_type(db, file_id, base_expr))
.collect();
let store = &db.jar().type_store;
let ty = Type::Class(store.add_class(file_id, &node.name.id, bases));
store.cache_node_type(file_id, *node_key.erased(), ty);
ty
}),
Definition::FunctionDef(node_key) => db
.jar()
.type_store
.get_cached_node_type(file_id, node_key.erased())
.unwrap_or_else(|| {
let parsed = db.parse(file_id);
let ast = parsed.ast();
let node = node_key
.resolve(ast.as_any_node_ref())
.expect("node key should resolve");
let store = &db.jar().type_store;
let ty = store.add_function(file_id, &node.name.id).into();
store.cache_node_type(file_id, *node_key.erased(), ty);
ty
}),
Definition::Assignment(node_key) => {
let parsed = db.parse(file_id);
let ast = parsed.ast();
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
// TODO handle unpacking assignment correctly
infer_expr_type(db, file_id, &node.value)
}
_ => todo!("other kinds of definitions"),
};
db.jar()
.type_store
.cache_symbol_type(file_id, symbol_id, ty);
// TODO record dependencies
ty
}
fn infer_expr_type<Db>(db: &Db, file_id: FileId, expr: &ast::Expr) -> Type
where
Db: SemanticDb + HasJar<SemanticJar>,
{
// TODO cache the resolution of the type on the node
let symbols = db.symbol_table(file_id);
match expr {
ast::Expr::Name(name) => {
if let Some(symbol_id) = symbols.root_symbol_id_by_name(&name.id) {
db.infer_symbol_type(file_id, symbol_id)
} else {
Type::Unknown
}
}
_ => todo!("full expression type resolution"),
}
}
#[cfg(test)]
mod tests {
use crate::db::tests::TestDb;
use crate::db::{HasJar, SemanticDb, SemanticJar};
use crate::module::{ModuleName, ModuleSearchPath, ModuleSearchPathKind};
use crate::types::Type;
// TODO with virtual filesystem we shouldn't have to write files to disk for these
// tests
struct TestCase {
temp_dir: tempfile::TempDir,
db: TestDb,
src: ModuleSearchPath,
}
fn create_test() -> std::io::Result<TestCase> {
let temp_dir = tempfile::tempdir()?;
let src = temp_dir.path().join("src");
std::fs::create_dir(&src)?;
let src = ModuleSearchPath::new(src.canonicalize()?, ModuleSearchPathKind::FirstParty);
let roots = vec![src.clone()];
let mut db = TestDb::default();
db.set_module_search_paths(roots);
Ok(TestCase { temp_dir, db, src })
}
#[test]
fn follow_import_to_class() -> std::io::Result<()> {
let case = create_test()?;
let db = &case.db;
let a_path = case.src.path().join("a.py");
let b_path = case.src.path().join("b.py");
std::fs::write(a_path, "from b import C as D; E = D")?;
std::fs::write(b_path, "class C: pass")?;
let a_file = db
.resolve_module(ModuleName::new("a"))
.expect("module should be found")
.path(db)
.file();
let a_syms = db.symbol_table(a_file);
let e_sym = a_syms
.root_symbol_id_by_name("E")
.expect("E symbol should be found");
let ty = db.infer_symbol_type(a_file, e_sym);
let jar = HasJar::<SemanticJar>::jar(db);
assert!(matches!(ty, Type::Class(_)));
assert_eq!(format!("{}", ty.display(&jar.type_store)), "Literal[C]");
Ok(())
}
#[test]
fn resolve_base_class_by_name() -> std::io::Result<()> {
let case = create_test()?;
let db = &case.db;
let path = case.src.path().join("mod.py");
std::fs::write(path, "class Base: pass\nclass Sub(Base): pass")?;
let file = db
.resolve_module(ModuleName::new("mod"))
.expect("module should be found")
.path(db)
.file();
let syms = db.symbol_table(file);
let sym = syms
.root_symbol_id_by_name("Sub")
.expect("Sub symbol should be found");
let ty = db.infer_symbol_type(file, sym);
let Type::Class(class_id) = ty else {
panic!("Sub is not a Class")
};
let jar = HasJar::<SemanticJar>::jar(db);
let base_names: Vec<_> = jar
.type_store
.get_class(class_id)
.bases()
.iter()
.map(|base_ty| format!("{}", base_ty.display(&jar.type_store)))
.collect();
assert_eq!(base_names, vec!["Literal[Base]"]);
Ok(())
}
}

View File

@@ -0,0 +1,78 @@
use anyhow::Context;
use std::path::Path;
use crate::files::Files;
use crate::program::{FileChange, FileChangeKind};
use notify::event::{CreateKind, RemoveKind};
use notify::{recommended_watcher, Event, EventKind, RecommendedWatcher, RecursiveMode, Watcher};
pub struct FileWatcher {
watcher: RecommendedWatcher,
}
pub trait EventHandler: Send + 'static {
fn handle(&self, changes: Vec<FileChange>);
}
impl<F> EventHandler for F
where
F: Fn(Vec<FileChange>) + Send + 'static,
{
fn handle(&self, changes: Vec<FileChange>) {
let f = self;
f(changes);
}
}
impl FileWatcher {
pub fn new<E>(handler: E, files: Files) -> anyhow::Result<Self>
where
E: EventHandler,
{
Self::from_handler(Box::new(handler), files)
}
fn from_handler(handler: Box<dyn EventHandler>, files: Files) -> anyhow::Result<Self> {
let watcher = recommended_watcher(move |changes: notify::Result<Event>| {
match changes {
Ok(event) => {
// TODO verify that this handles all events correctly
let change_kind = match event.kind {
EventKind::Create(CreateKind::File) => FileChangeKind::Created,
EventKind::Modify(_) => FileChangeKind::Modified,
EventKind::Remove(RemoveKind::File) => FileChangeKind::Deleted,
_ => {
return;
}
};
let mut changes = Vec::new();
for path in event.paths {
if path.is_file() {
let id = files.intern(&path);
changes.push(FileChange::new(id, change_kind));
}
}
if !changes.is_empty() {
handler.handle(changes);
}
}
// TODO proper error handling
Err(err) => {
panic!("Error: {err}");
}
}
})
.context("Failed to create file watcher.")?;
Ok(Self { watcher })
}
pub fn watch_folder(&mut self, path: &Path) -> anyhow::Result<()> {
self.watcher.watch(path, RecursiveMode::Recursive)?;
Ok(())
}
}

View File

@@ -1,6 +1,6 @@
[package]
name = "ruff"
version = "0.3.6"
version = "0.4.2"
publish = false
authors = { workspace = true }
edition = { workspace = true }

View File

@@ -190,7 +190,7 @@ pub struct CheckCommand {
pub output_format: Option<SerializationFormat>,
/// Specify file to write the linter output to (default: stdout).
#[arg(short, long)]
#[arg(short, long, env = "RUFF_OUTPUT_FILE")]
pub output_file: Option<PathBuf>,
/// The minimum Python version that should be supported.
#[arg(long, value_enum)]

View File

@@ -375,15 +375,17 @@ pub(crate) fn init(path: &Path) -> Result<()> {
fs::create_dir_all(path.join(VERSION))?;
// Add the CACHEDIR.TAG.
if !cachedir::is_tagged(path)? {
cachedir::add_tag(path)?;
}
cachedir::ensure_tag(path)?;
// Add the .gitignore.
let gitignore_path = path.join(".gitignore");
if !gitignore_path.exists() {
let mut file = fs::File::create(gitignore_path)?;
file.write_all(b"# Automatically created by ruff.\n*\n")?;
match fs::OpenOptions::new()
.write(true)
.create_new(true)
.open(path.join(".gitignore"))
{
Ok(mut file) => file.write_all(b"# Automatically created by ruff.\n*\n")?,
Err(err) if err.kind() == io::ErrorKind::AlreadyExists => (),
Err(err) => return Err(err.into()),
}
Ok(())

View File

@@ -523,7 +523,7 @@ from module import =
----- stdout -----
----- stderr -----
error: Failed to parse main.py:2:20: Unexpected token '='
error: Failed to parse main.py:2:20: Expected an import name
"###);
Ok(())

View File

@@ -731,11 +731,11 @@ fn stdin_parse_error() {
success: false
exit_code: 1
----- stdout -----
-:1:17: E999 SyntaxError: Unexpected token '='
-:1:17: E999 SyntaxError: Expected an import name
Found 1 error.
----- stderr -----
error: Failed to parse at 1:17: Unexpected token '='
error: Failed to parse at 1:17: Expected an import name
"###);
}

View File

@@ -1248,3 +1248,85 @@ fn negated_per_file_ignores_absolute() -> Result<()> {
});
Ok(())
}
/// patterns are additive, can't use negative patterns to "un-ignore"
#[test]
fn negated_per_file_ignores_overlap() -> Result<()> {
let tempdir = TempDir::new()?;
let ruff_toml = tempdir.path().join("ruff.toml");
fs::write(
&ruff_toml,
r#"
[lint.per-file-ignores]
"*.py" = ["RUF"]
"!foo.py" = ["RUF"]
"#,
)?;
let foo_file = tempdir.path().join("foo.py");
fs::write(foo_file, "")?;
let bar_file = tempdir.path().join("bar.py");
fs::write(bar_file, "")?;
assert_cmd_snapshot!(Command::new(get_cargo_bin(BIN_NAME))
.args(STDIN_BASE_OPTIONS)
.arg("--config")
.arg(&ruff_toml)
.arg("--select")
.arg("RUF901")
.current_dir(&tempdir)
, @r###"
success: true
exit_code: 0
----- stdout -----
All checks passed!
----- stderr -----
"###);
Ok(())
}
#[test]
fn unused_interaction() -> Result<()> {
let tempdir = TempDir::new()?;
let ruff_toml = tempdir.path().join("ruff.toml");
fs::write(
&ruff_toml,
r#"
[lint]
select = ["F"]
"#,
)?;
insta::with_settings!({
filters => vec![(tempdir_filter(&tempdir).as_str(), "[TMP]/")]
}, {
assert_cmd_snapshot!(Command::new(get_cargo_bin(BIN_NAME))
.args(STDIN_BASE_OPTIONS)
.arg("--config")
.arg(&ruff_toml)
.args(["--stdin-filename", "test.py"])
.arg("--fix")
.arg("-")
.pass_stdin(r#"
import os # F401
def function():
import os # F811
print(os.name)
"#), @r###"
success: true
exit_code: 0
----- stdout -----
import os # F401
def function():
print(os.name)
----- stderr -----
Found 1 error (1 fixed, 0 remaining).
"###);
});
Ok(())
}

View File

@@ -53,6 +53,7 @@ file_resolver.extend_exclude = [
"crates/ruff/resources/",
"crates/ruff_linter/resources/",
"crates/ruff_python_formatter/resources/",
"crates/ruff_python_parser/resources/",
]
file_resolver.force_exclude = false
file_resolver.include = [
@@ -68,128 +69,128 @@ file_resolver.project_root = "[BASEPATH]"
linter.exclude = []
linter.project_root = "[BASEPATH]"
linter.rules.enabled = [
MultipleImportsOnOneLine,
ModuleImportNotAtTopOfFile,
MultipleStatementsOnOneLineColon,
MultipleStatementsOnOneLineSemicolon,
UselessSemicolon,
NoneComparison,
TrueFalseComparison,
NotInTest,
NotIsTest,
TypeComparison,
BareExcept,
LambdaAssignment,
AmbiguousVariableName,
AmbiguousClassName,
AmbiguousFunctionName,
IOError,
SyntaxError,
UnusedImport,
ImportShadowedByLoopVar,
UndefinedLocalWithImportStar,
LateFutureImport,
UndefinedLocalWithImportStarUsage,
UndefinedLocalWithNestedImportStarUsage,
FutureFeatureNotDefined,
PercentFormatInvalidFormat,
PercentFormatExpectedMapping,
PercentFormatExpectedSequence,
PercentFormatExtraNamedArguments,
PercentFormatMissingArgument,
PercentFormatMixedPositionalAndNamed,
PercentFormatPositionalCountMismatch,
PercentFormatStarRequiresSequence,
PercentFormatUnsupportedFormatCharacter,
StringDotFormatInvalidFormat,
StringDotFormatExtraNamedArguments,
StringDotFormatExtraPositionalArguments,
StringDotFormatMissingArguments,
StringDotFormatMixingAutomatic,
FStringMissingPlaceholders,
MultiValueRepeatedKeyLiteral,
MultiValueRepeatedKeyVariable,
ExpressionsInStarAssignment,
MultipleStarredExpressions,
AssertTuple,
IsLiteral,
InvalidPrintSyntax,
IfTuple,
BreakOutsideLoop,
ContinueOutsideLoop,
YieldOutsideFunction,
ReturnOutsideFunction,
DefaultExceptNotLast,
ForwardAnnotationSyntaxError,
RedefinedWhileUnused,
UndefinedName,
UndefinedExport,
UndefinedLocal,
UnusedVariable,
UnusedAnnotation,
RaiseNotImplemented,
multiple-imports-on-one-line (E401),
module-import-not-at-top-of-file (E402),
multiple-statements-on-one-line-colon (E701),
multiple-statements-on-one-line-semicolon (E702),
useless-semicolon (E703),
none-comparison (E711),
true-false-comparison (E712),
not-in-test (E713),
not-is-test (E714),
type-comparison (E721),
bare-except (E722),
lambda-assignment (E731),
ambiguous-variable-name (E741),
ambiguous-class-name (E742),
ambiguous-function-name (E743),
io-error (E902),
syntax-error (E999),
unused-import (F401),
import-shadowed-by-loop-var (F402),
undefined-local-with-import-star (F403),
late-future-import (F404),
undefined-local-with-import-star-usage (F405),
undefined-local-with-nested-import-star-usage (F406),
future-feature-not-defined (F407),
percent-format-invalid-format (F501),
percent-format-expected-mapping (F502),
percent-format-expected-sequence (F503),
percent-format-extra-named-arguments (F504),
percent-format-missing-argument (F505),
percent-format-mixed-positional-and-named (F506),
percent-format-positional-count-mismatch (F507),
percent-format-star-requires-sequence (F508),
percent-format-unsupported-format-character (F509),
string-dot-format-invalid-format (F521),
string-dot-format-extra-named-arguments (F522),
string-dot-format-extra-positional-arguments (F523),
string-dot-format-missing-arguments (F524),
string-dot-format-mixing-automatic (F525),
f-string-missing-placeholders (F541),
multi-value-repeated-key-literal (F601),
multi-value-repeated-key-variable (F602),
expressions-in-star-assignment (F621),
multiple-starred-expressions (F622),
assert-tuple (F631),
is-literal (F632),
invalid-print-syntax (F633),
if-tuple (F634),
break-outside-loop (F701),
continue-outside-loop (F702),
yield-outside-function (F704),
return-outside-function (F706),
default-except-not-last (F707),
forward-annotation-syntax-error (F722),
redefined-while-unused (F811),
undefined-name (F821),
undefined-export (F822),
undefined-local (F823),
unused-variable (F841),
unused-annotation (F842),
raise-not-implemented (F901),
]
linter.rules.should_fix = [
MultipleImportsOnOneLine,
ModuleImportNotAtTopOfFile,
MultipleStatementsOnOneLineColon,
MultipleStatementsOnOneLineSemicolon,
UselessSemicolon,
NoneComparison,
TrueFalseComparison,
NotInTest,
NotIsTest,
TypeComparison,
BareExcept,
LambdaAssignment,
AmbiguousVariableName,
AmbiguousClassName,
AmbiguousFunctionName,
IOError,
SyntaxError,
UnusedImport,
ImportShadowedByLoopVar,
UndefinedLocalWithImportStar,
LateFutureImport,
UndefinedLocalWithImportStarUsage,
UndefinedLocalWithNestedImportStarUsage,
FutureFeatureNotDefined,
PercentFormatInvalidFormat,
PercentFormatExpectedMapping,
PercentFormatExpectedSequence,
PercentFormatExtraNamedArguments,
PercentFormatMissingArgument,
PercentFormatMixedPositionalAndNamed,
PercentFormatPositionalCountMismatch,
PercentFormatStarRequiresSequence,
PercentFormatUnsupportedFormatCharacter,
StringDotFormatInvalidFormat,
StringDotFormatExtraNamedArguments,
StringDotFormatExtraPositionalArguments,
StringDotFormatMissingArguments,
StringDotFormatMixingAutomatic,
FStringMissingPlaceholders,
MultiValueRepeatedKeyLiteral,
MultiValueRepeatedKeyVariable,
ExpressionsInStarAssignment,
MultipleStarredExpressions,
AssertTuple,
IsLiteral,
InvalidPrintSyntax,
IfTuple,
BreakOutsideLoop,
ContinueOutsideLoop,
YieldOutsideFunction,
ReturnOutsideFunction,
DefaultExceptNotLast,
ForwardAnnotationSyntaxError,
RedefinedWhileUnused,
UndefinedName,
UndefinedExport,
UndefinedLocal,
UnusedVariable,
UnusedAnnotation,
RaiseNotImplemented,
multiple-imports-on-one-line (E401),
module-import-not-at-top-of-file (E402),
multiple-statements-on-one-line-colon (E701),
multiple-statements-on-one-line-semicolon (E702),
useless-semicolon (E703),
none-comparison (E711),
true-false-comparison (E712),
not-in-test (E713),
not-is-test (E714),
type-comparison (E721),
bare-except (E722),
lambda-assignment (E731),
ambiguous-variable-name (E741),
ambiguous-class-name (E742),
ambiguous-function-name (E743),
io-error (E902),
syntax-error (E999),
unused-import (F401),
import-shadowed-by-loop-var (F402),
undefined-local-with-import-star (F403),
late-future-import (F404),
undefined-local-with-import-star-usage (F405),
undefined-local-with-nested-import-star-usage (F406),
future-feature-not-defined (F407),
percent-format-invalid-format (F501),
percent-format-expected-mapping (F502),
percent-format-expected-sequence (F503),
percent-format-extra-named-arguments (F504),
percent-format-missing-argument (F505),
percent-format-mixed-positional-and-named (F506),
percent-format-positional-count-mismatch (F507),
percent-format-star-requires-sequence (F508),
percent-format-unsupported-format-character (F509),
string-dot-format-invalid-format (F521),
string-dot-format-extra-named-arguments (F522),
string-dot-format-extra-positional-arguments (F523),
string-dot-format-missing-arguments (F524),
string-dot-format-mixing-automatic (F525),
f-string-missing-placeholders (F541),
multi-value-repeated-key-literal (F601),
multi-value-repeated-key-variable (F602),
expressions-in-star-assignment (F621),
multiple-starred-expressions (F622),
assert-tuple (F631),
is-literal (F632),
invalid-print-syntax (F633),
if-tuple (F634),
break-outside-loop (F701),
continue-outside-loop (F702),
yield-outside-function (F704),
return-outside-function (F706),
default-except-not-last (F707),
forward-annotation-syntax-error (F722),
redefined-while-unused (F811),
undefined-name (F821),
undefined-export (F822),
undefined-local (F823),
unused-variable (F841),
unused-annotation (F842),
raise-not-implemented (F901),
]
linter.per_file_ignores = {}
linter.safety_table.forced_safe = []

View File

@@ -11,9 +11,9 @@ repository = { workspace = true }
license = { workspace = true }
[dependencies]
itertools = { workspace = true }
glob = { workspace = true }
globset = { workspace = true }
itertools = { workspace = true }
regex = { workspace = true }
filetime = { workspace = true }
seahash = { workspace = true }

View File

@@ -65,7 +65,7 @@ use seahash::SeaHasher;
/// The main reason is that hashes and cache keys have different constraints:
///
/// * Cache keys are less performance sensitive: Hashes must be super fast to compute for performant hashed-collections. That's
/// why some standard types don't implement [`Hash`] where it would be safe to to implement [`CacheKey`], e.g. `HashSet`
/// why some standard types don't implement [`Hash`] where it would be safe to implement [`CacheKey`], e.g. `HashSet`
/// * Cache keys must be deterministic where hash keys do not have this constraint. That's why pointers don't implement [`CacheKey`] but they implement [`Hash`].
/// * Ideally, cache keys are portable
///

View File

@@ -138,7 +138,7 @@ pub const fn empty_line() -> Line {
///
/// # Examples
///
/// The line breaks are emitted as spaces if the enclosing `Group` fits on a a single line:
/// The line breaks are emitted as spaces if the enclosing `Group` fits on a single line:
/// ```
/// use ruff_formatter::{format, format_args};
/// use ruff_formatter::prelude::*;

View File

@@ -1,6 +1,6 @@
[package]
name = "ruff_linter"
version = "0.3.6"
version = "0.4.2"
publish = false
authors = { workspace = true }
edition = { workspace = true }

View File

@@ -17,3 +17,9 @@ urllib.request.URLopener().open(fullurl='http://www.google.com')
urllib.request.URLopener().open('http://www.google.com')
urllib.request.URLopener().open('file:///foo/bar/baz')
urllib.request.URLopener().open(url)
urllib.request.urlopen(url=urllib.request.Request('http://www.google.com'))
urllib.request.urlopen(url=urllib.request.Request('http://www.google.com'), **kwargs)
urllib.request.urlopen(urllib.request.Request('http://www.google.com'))
urllib.request.urlopen(urllib.request.Request('file:///foo/bar/baz'))
urllib.request.urlopen(urllib.request.Request(url))

View File

@@ -124,3 +124,8 @@ try:
pass
except Exception:
error("...", exc_info=True)
try:
...
except Exception as e:
raise ValueError from e

View File

@@ -6,6 +6,25 @@ def this_is_a_bug():
print("Ooh, callable! Or is it?")
def still_a_bug():
import builtins
o = object()
if builtins.hasattr(o, "__call__"):
print("B U G")
if builtins.getattr(o, "__call__", False):
print("B U G")
def trickier_fix_for_this_one():
o = object()
def callable(x):
return True
if hasattr(o, "__call__"):
print("STILL a bug!")
def this_is_fine():
o = object()
if callable(o):

View File

@@ -0,0 +1,20 @@
def foo(a: list = []):
raise NotImplementedError("")
def bar(a: dict = {}):
""" This one also has a docstring"""
raise NotImplementedError("and has some text in here")
def baz(a: list = []):
"""This one raises a different exception"""
raise IndexError()
def qux(a: list = []):
raise NotImplementedError
def quux(a: list = []):
raise NotImplemented

View File

@@ -23,3 +23,15 @@ def okay(data: custom.ImmutableTypeA = foo()):
def error_due_to_missing_import(data: List[str] = Depends(None)):
...
class Class:
pass
def okay(obj=Class()):
...
def error(obj=OtherClass()):
...

View File

@@ -64,3 +64,6 @@ setattr(*foo, "bar", None)
# Regression test for: https://github.com/astral-sh/ruff/issues/7455#issuecomment-1739800901
getattr(self.
registration.registry, '__name__')
import builtins
builtins.getattr(foo, "bar")

View File

@@ -23,3 +23,7 @@ zip([1, 2, 3], repeat(1, times=None))
# Errors (limited iterators).
zip([1, 2, 3], repeat(1, 1))
zip([1, 2, 3], repeat(1, times=4))
import builtins
# Still an error even though it uses the qualified name
builtins.zip([1, 2, 3])

View File

@@ -0,0 +1,160 @@
"""
Should emit:
B909 - on lines 11, 25, 26, 40, 46
"""
# lists
some_list = [1, 2, 3]
some_other_list = [1, 2, 3]
for elem in some_list:
# errors
some_list.remove(0)
del some_list[2]
some_list.append(elem)
some_list.sort()
some_list.reverse()
some_list.clear()
some_list.extend([1, 2])
some_list.insert(1, 1)
some_list.pop(1)
some_list.pop()
# conditional break should error
if elem == 2:
some_list.remove(0)
if elem == 3:
break
# non-errors
some_other_list.remove(elem)
del some_list
del some_other_list
found_idx = some_list.index(elem)
some_list = 3
# unconditional break should not error
if elem == 2:
some_list.remove(elem)
break
# dicts
mydicts = {"a": {"foo": 1, "bar": 2}}
for elem in mydicts:
# errors
mydicts.popitem()
mydicts.setdefault("foo", 1)
mydicts.update({"foo": "bar"})
# no errors
elem.popitem()
elem.setdefault("foo", 1)
elem.update({"foo": "bar"})
# sets
myset = {1, 2, 3}
for _ in myset:
# errors
myset.update({4, 5})
myset.intersection_update({4, 5})
myset.difference_update({4, 5})
myset.symmetric_difference_update({4, 5})
myset.add(4)
myset.discard(3)
# no errors
del myset
# members
class A:
some_list: list
def __init__(self, ls):
self.some_list = list(ls)
a = A((1, 2, 3))
# ensure member accesses are handled as errors
for elem in a.some_list:
a.some_list.remove(0)
del a.some_list[2]
# Augassign should error
foo = [1, 2, 3]
bar = [4, 5, 6]
for _ in foo:
foo *= 2
foo += bar
foo[1] = 9
foo[1:2] = bar
foo[1:2:3] = bar
foo = {1, 2, 3}
bar = {4, 5, 6}
for _ in foo: # should error
foo |= bar
foo &= bar
foo -= bar
foo ^= bar
# more tests for unconditional breaks - should not error
for _ in foo:
foo.remove(1)
for _ in bar:
bar.remove(1)
break
break
# should not error
for _ in foo:
foo.remove(1)
for _ in bar:
...
break
# should error (?)
for _ in foo:
foo.remove(1)
if bar:
bar.remove(1)
break
break
# should error
for _ in foo:
if bar:
pass
else:
foo.remove(1)
# should error
for elem in some_list:
if some_list.pop() == 2:
pass
# should not error
for elem in some_list:
if some_list.pop() == 2:
break
# should error
for elem in some_list:
if some_list.pop() == 2:
pass
else:
break
# should not error
for elem in some_list:
del some_list[elem]
some_list[elem] = 1
some_list.remove(elem)
some_list.discard(elem)

View File

@@ -7,6 +7,9 @@ logging.log(logging.INFO, f"Hello {name}")
_LOGGER = logging.getLogger()
_LOGGER.info(f"{__name__}")
logging.getLogger().info(f"{name}")
from logging import info
info(f"{name}")
info(f"{__name__}")

View File

@@ -1,6 +1,9 @@
# PIE808
range(0, 10)
import builtins
builtins.range(0, 10)
# OK
range(x, 10)
range(-15, 10)

View File

@@ -14,6 +14,15 @@ IntOrStr: TypeAlias = int | str
IntOrFloat: Foo = int | float
AliasNone: typing.TypeAlias = None
# these are ok
class NotAnEnum:
NOT_A_STUB_SO_THIS_IS_FINE = None
from enum import Enum
class FooEnum(Enum): ...
class BarEnum(FooEnum):
BAR = None
VarAlias = str
AliasFoo = Foo

View File

@@ -13,6 +13,16 @@ IntOrStr: TypeAlias = int | str
IntOrFloat: Foo = int | float
AliasNone: typing.TypeAlias = None
class NotAnEnum:
FLAG_THIS = None
# these are ok
from enum import Enum
class FooEnum(Enum): ...
class BarEnum(FooEnum):
BAR = None
VarAlias = str
AliasFoo = Foo

View File

@@ -3,7 +3,7 @@ import types
import typing
from collections.abc import Awaitable
from types import TracebackType
from typing import Any, Type
from typing import Any, Type, overload
import _typeshed
import typing_extensions
@@ -73,3 +73,97 @@ class BadFive:
class BadSix:
def __exit__(self, typ, exc, tb, weird_extra_arg, extra_arg2 = None) -> None: ... # PYI036: Extra arg must have default
async def __aexit__(self, typ, exc, tb, *, weird_extra_arg) -> None: ... # PYI036: kwargs must have default
class AllPositionalOnlyArgs:
def __exit__(self, typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None, /) -> None: ...
async def __aexit__(self, typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None, /) -> None: ...
class BadAllPositionalOnlyArgs:
def __exit__(self, typ: type[Exception] | None, exc: BaseException | None, tb: TracebackType | None, /) -> None: ...
async def __aexit__(self, typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType, /) -> None: ...
# Definitions not in a class scope can do whatever, we don't care
def __exit__(self, *args: bool) -> None: ...
async def __aexit__(self, *, go_crazy: bytes) -> list[str]: ...
# Here come the overloads...
class AcceptableOverload1:
@overload
def __exit__(self, exc_typ: None, exc: None, exc_tb: None) -> None: ...
@overload
def __exit__(self, exc_typ: type[BaseException], exc: BaseException, exc_tb: TracebackType) -> None: ...
def __exit__(self, exc_typ: type[BaseException] | None, exc: BaseException | None, exc_tb: TracebackType | None) -> None: ...
# Using `object` or `Unused` in an overload definition is kinda strange,
# but let's allow it to be on the safe side
class AcceptableOverload2:
@overload
def __exit__(self, exc_typ: None, exc: None, exc_tb: object) -> None: ...
@overload
def __exit__(self, exc_typ: Unused, exc: BaseException, exc_tb: object) -> None: ...
def __exit__(self, exc_typ: type[BaseException] | None, exc: BaseException | None, exc_tb: TracebackType | None) -> None: ...
class AcceptableOverload3:
# Just ignore any overloads that don't have exactly 3 annotated non-self parameters.
# We don't have the ability (yet) to do arbitrary checking
# of whether one function definition is a subtype of another...
@overload
def __exit__(self, exc_typ: bool, exc: bool, exc_tb: bool, weird_extra_arg: bool) -> None: ...
@overload
def __exit__(self, *args: object) -> None: ...
def __exit__(self, *args: object) -> None: ...
@overload
async def __aexit__(self, exc_typ: bool, /, exc: bool, exc_tb: bool, *, keyword_only: str) -> None: ...
@overload
async def __aexit__(self, *args: object) -> None: ...
async def __aexit__(self, *args: object) -> None: ...
class AcceptableOverload4:
# Same as above
@overload
def __exit__(self, exc_typ: type[Exception], exc: type[Exception], exc_tb: types.TracebackType) -> None: ...
@overload
def __exit__(self, *args: object) -> None: ...
def __exit__(self, *args: object) -> None: ...
@overload
async def __aexit__(self, exc_typ: type[Exception], exc: type[Exception], exc_tb: types.TracebackType, *, extra: str = "foo") -> None: ...
@overload
async def __aexit__(self, exc_typ: None, exc: None, tb: None) -> None: ...
async def __aexit__(self, *args: object) -> None: ...
class StrangeNumberOfOverloads:
# Only one overload? Type checkers will emit an error, but we should just ignore it
@overload
def __exit__(self, exc_typ: bool, exc: bool, tb: bool) -> None: ...
def __exit__(self, exc_typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None) -> None: ...
# More than two overloads? Anything could be going on; again, just ignore all the overloads
@overload
async def __aexit__(self, arg: bool) -> None: ...
@overload
async def __aexit__(self, arg: None, arg2: None, arg3: None) -> None: ...
@overload
async def __aexit__(self, arg: bool, arg2: bool, arg3: bool) -> None: ...
async def __aexit__(self, exc_typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None) -> None: ...
# TODO: maybe we should emit an error on this one as well?
class BizarreAsyncSyncOverloadMismatch:
@overload
def __exit__(self, exc_typ: bool, exc: bool, tb: bool) -> None: ...
@overload
async def __exit__(self, exc_typ: bool, exc: bool, tb: bool) -> None: ...
def __exit__(self, *args: object) -> None: ...
class UnacceptableOverload1:
@overload
def __exit__(self, exc_typ: None, exc: None, tb: None) -> None: ... # Okay
@overload
def __exit__(self, exc_typ: Exception, exc: Exception, tb: TracebackType) -> None: ... # PYI036
def __exit__(self, exc_typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None) -> None: ...
class UnacceptableOverload2:
@overload
def __exit__(self, exc_typ: type[BaseException] | None, exc: None, tb: None) -> None: ... # PYI036
@overload
def __exit__(self, exc_typ: object, exc: Exception, tb: builtins.TracebackType) -> None: ... # PYI036
def __exit__(self, exc_typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None) -> None: ...

View File

@@ -3,7 +3,7 @@ import types
import typing
from collections.abc import Awaitable
from types import TracebackType
from typing import Any, Type
from typing import Any, Type, overload
import _typeshed
import typing_extensions
@@ -73,3 +73,93 @@ class BadFive:
class BadSix:
def __exit__(self, typ, exc, tb, weird_extra_arg, extra_arg2 = None) -> None: ... # PYI036: Extra arg must have default
async def __aexit__(self, typ, exc, tb, *, weird_extra_arg) -> None: ... # PYI036: kwargs must have default
def isolated_scope():
from builtins import type as Type
class ShouldNotError:
def __exit__(self, typ: Type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None) -> None: ...
class AllPositionalOnlyArgs:
def __exit__(self, typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None, /) -> None: ...
async def __aexit__(self, typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType | None, /) -> None: ...
class BadAllPositionalOnlyArgs:
def __exit__(self, typ: type[Exception] | None, exc: BaseException | None, tb: TracebackType | None, /) -> None: ...
async def __aexit__(self, typ: type[BaseException] | None, exc: BaseException | None, tb: TracebackType, /) -> None: ...
# Definitions not in a class scope can do whatever, we don't care
def __exit__(self, *args: bool) -> None: ...
async def __aexit__(self, *, go_crazy: bytes) -> list[str]: ...
# Here come the overloads...
class AcceptableOverload1:
@overload
def __exit__(self, exc_typ: None, exc: None, exc_tb: None) -> None: ...
@overload
def __exit__(self, exc_typ: type[BaseException], exc: BaseException, exc_tb: TracebackType) -> None: ...
# Using `object` or `Unused` in an overload definition is kinda strange,
# but let's allow it to be on the safe side
class AcceptableOverload2:
@overload
def __exit__(self, exc_typ: None, exc: None, exc_tb: object) -> None: ...
@overload
def __exit__(self, exc_typ: Unused, exc: BaseException, exc_tb: object) -> None: ...
class AcceptableOverload3:
# Just ignore any overloads that don't have exactly 3 annotated non-self parameters.
# We don't have the ability (yet) to do arbitrary checking
# of whether one function definition is a subtype of another...
@overload
def __exit__(self, exc_typ: bool, exc: bool, exc_tb: bool, weird_extra_arg: bool) -> None: ...
@overload
def __exit__(self, *args: object) -> None: ...
@overload
async def __aexit__(self, exc_typ: bool, /, exc: bool, exc_tb: bool, *, keyword_only: str) -> None: ...
@overload
async def __aexit__(self, *args: object) -> None: ...
class AcceptableOverload4:
# Same as above
@overload
def __exit__(self, exc_typ: type[Exception], exc: type[Exception], exc_tb: types.TracebackType) -> None: ...
@overload
def __exit__(self, *args: object) -> None: ...
@overload
async def __aexit__(self, exc_typ: type[Exception], exc: type[Exception], exc_tb: types.TracebackType, *, extra: str = "foo") -> None: ...
@overload
async def __aexit__(self, exc_typ: None, exc: None, tb: None) -> None: ...
class StrangeNumberOfOverloads:
# Only one overload? Type checkers will emit an error, but we should just ignore it
@overload
def __exit__(self, exc_typ: bool, exc: bool, tb: bool) -> None: ...
# More than two overloads? Anything could be going on; again, just ignore all the overloads
@overload
async def __aexit__(self, arg: bool) -> None: ...
@overload
async def __aexit__(self, arg: None, arg2: None, arg3: None) -> None: ...
@overload
async def __aexit__(self, arg: bool, arg2: bool, arg3: bool) -> None: ...
# TODO: maybe we should emit an error on this one as well?
class BizarreAsyncSyncOverloadMismatch:
@overload
def __exit__(self, exc_typ: bool, exc: bool, tb: bool) -> None: ...
@overload
async def __exit__(self, exc_typ: bool, exc: bool, tb: bool) -> None: ...
class UnacceptableOverload1:
@overload
def __exit__(self, exc_typ: None, exc: None, tb: None) -> None: ... # Okay
@overload
def __exit__(self, exc_typ: Exception, exc: Exception, tb: TracebackType) -> None: ... # PYI036
class UnacceptableOverload2:
@overload
def __exit__(self, exc_typ: type[BaseException] | None, exc: None, tb: None) -> None: ... # PYI036
@overload
def __exit__(self, exc_typ: object, exc: Exception, tb: builtins.TracebackType) -> None: ... # PYI036

View File

@@ -58,3 +58,8 @@ for key in (
.keys()
):
continue
from builtins import dict as SneakyDict
d = SneakyDict()
key in d.keys() # SIM118

View File

@@ -21,3 +21,5 @@ for k, v in zip(d2.keys(), d2.values()): # SIM911
...
items = zip(x.keys(), x.values()) # OK
items.bar = zip(x.keys(), x.values()) # OK

View File

@@ -11,3 +11,11 @@ from enum import Enum
class Fine(str, Enum): # Ok
__slots__ = ["foo"]
class SubEnum(Enum):
pass
class Ok(str, SubEnum): # Ok
pass

View File

@@ -19,3 +19,9 @@ class Bad(Tuple[str, int, float]): # SLOT001
class Good(Tuple[str, int, float]): # OK
__slots__ = ("foo",)
import builtins
class AlsoBad(builtins.tuple[int, int]): # SLOT001
pass

View File

@@ -80,3 +80,8 @@ for i in list(foo_list): # OK
for i in list(foo_list): # OK
if True:
del foo_list[i + 1]
import builtins
for i in builtins.list(nested_tuple): # PERF101
pass

View File

@@ -81,3 +81,11 @@ def foo():
result = {}
for idx, name in enumerate(fruit):
result[name] = idx # PERF403
def foo():
from builtins import dict as SneakyDict
fruit = ["apple", "pear", "orange"]
result = SneakyDict()
for idx, name in enumerate(fruit):
result[name] = idx # PERF403

View File

@@ -445,6 +445,39 @@ def test():
# end
# no error
class Foo:
"""Demo."""
@overload
def bar(self, x: int) -> int: ...
@overload
def bar(self, x: str) -> str: ...
def bar(self, x: int | str) -> int | str:
return x
# end
# no error
@overload
def foo(x: int) -> int: ...
@overload
def foo(x: str) -> str: ...
def foo(x: int | str) -> int | str:
if not isinstance(x, (int, str)):
raise TypeError
return x
# end
# no error
def foo(self, x: int) -> int: ...
def bar(self, x: str) -> str: ...
def baz(self, x: int | str) -> int | str:
return x
# end
# E301
class Class(object):
@@ -489,6 +522,20 @@ class Class:
# end
# E301
class Foo:
"""Demo."""
@overload
def bar(self, x: int) -> int: ...
@overload
def bar(self, x: str) -> str:
...
def bar(self, x: int | str) -> int | str:
return x
# end
# E302
"""Main module."""
def fn():
@@ -580,6 +627,23 @@ class Test:
# end
# E302
class A:...
class B: ...
# end
# E302
@overload
def fn(a: int) -> int: ...
@overload
def fn(a: str) -> str: ...
def fn(a: int | str) -> int | str:
...
# end
# E303
def fn():
_ = None

View File

@@ -138,3 +138,8 @@ np.dtype(int) == float
#: E721
dtype == float
import builtins
if builtins.type(res) == memoryview: # E721
pass

View File

@@ -53,7 +53,7 @@ class WithinBody[T](list[T]): # OK
def foo(self, x: T) -> T: # OK
return x
def foo(self):
T # OK
@@ -76,6 +76,16 @@ type Foo[T: (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNot
def foo[T: (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
class Foo[T: (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
# Same in defaults
type Foo[T = DoesNotExist] = T # F821: Undefined name `DoesNotExist`
def foo[T = DoesNotExist](t: T) -> T: return t # F821: Undefined name `DoesNotExist`
class Foo[T = DoesNotExist](list[T]): ... # F821: Undefined name `DoesNotExist`
type Foo[T = (DoesNotExist1, DoesNotExist2)] = T # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
def foo[T = (DoesNotExist1, DoesNotExist2)](t: T) -> T: return t # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
class Foo[T = (DoesNotExist1, DoesNotExist2)](list[T]): ... # F821: Undefined name `DoesNotExist1`, Undefined name `DoesNotExist2`
# Type parameters in nested classes
class Parent[T]:
@@ -83,21 +93,21 @@ class Parent[T]:
def can_use_class_variable(self, x: t) -> t: # OK
return x
class Child:
def can_access_parent_type_parameter(self, x: T) -> T: # OK
T # OK
return x
def cannot_access_parent_variable(self, x: t) -> t: # F821: Undefined name `T`
t # F821: Undefined name `t`
return x
# Type parameters in nested functions
def can_access_inside_nested[T](t: T) -> T: # OK
def bar(x: T) -> T: # OK
T # OK
return x
bar(t)

View File

@@ -4,3 +4,8 @@ def f() -> None:
def g() -> None:
raise NotImplemented
def h() -> None:
NotImplementedError = "foo"
raise NotImplemented

View File

@@ -0,0 +1 @@
#noqa

View File

@@ -32,3 +32,6 @@ pathlib.Path(NAME).open(mode)
pathlib.Path(NAME).open("rwx") # [bad-open-mode]
pathlib.Path(NAME).open(mode="rwx") # [bad-open-mode]
pathlib.Path(NAME).open("rwx", encoding="utf-8") # [bad-open-mode]
import builtins
builtins.open(NAME, "Ua", encoding="utf-8")

View File

@@ -134,3 +134,28 @@ if value.attr > 3:
value.
attr
) = 3
class Foo:
_min = 0
_max = 0
def foo(self, value) -> None:
if value < self._min:
self._min = value
if value > self._max:
self._max = value
if self._min < value:
self._min = value
if self._max > value:
self._max = value
if value <= self._min:
self._min = value
if value >= self._max:
self._max = value
if self._min <= value:
self._min = value
if self._max >= value:
self._max = value

View File

@@ -0,0 +1,65 @@
# These testcases should raise errors
class Float:
def __bytes__(self):
return 3.05 # [invalid-bytes-return]
class Int:
def __bytes__(self):
return 0 # [invalid-bytes-return]
class Str:
def __bytes__(self):
return "some bytes" # [invalid-bytes-return]
class BytesNoReturn:
def __bytes__(self):
print("ruff") # [invalid-bytes-return]
# TODO: Once Ruff has better type checking
def return_bytes():
return "some string"
class ComplexReturn:
def __bytes__(self):
return return_bytes() # [invalid-bytes-return]
# These testcases should NOT raise errors
class Bytes:
def __bytes__(self):
return b"some bytes"
class Bytes2:
def __bytes__(self):
x = b"some bytes"
return x
class Bytes3:
def __bytes__(self): ...
class Bytes4:
def __bytes__(self):
pass
class Bytes5:
def __bytes__(self):
raise NotImplementedError
class Bytes6:
def __bytes__(self):
print("raise some error")
raise NotImplementedError

View File

@@ -0,0 +1,65 @@
# These testcases should raise errors
class Bool:
def __hash__(self):
return True # [invalid-hash-return]
class Float:
def __hash__(self):
return 3.05 # [invalid-hash-return]
class Str:
def __hash__(self):
return "ruff" # [invalid-hash-return]
class HashNoReturn:
def __hash__(self):
print("ruff") # [invalid-hash-return]
# TODO: Once Ruff has better type checking
def return_int():
return "3"
class ComplexReturn:
def __hash__(self):
return return_int() # [invalid-hash-return]
# These testcases should NOT raise errors
class Hash:
def __hash__(self):
return 7741
class Hash2:
def __hash__(self):
x = 7741
return x
class Hash3:
def __hash__(self): ...
class Has4:
def __hash__(self):
pass
class Hash5:
def __hash__(self):
raise NotImplementedError
class HashWrong6:
def __hash__(self):
print("raise some error")
raise NotImplementedError

View File

@@ -0,0 +1,73 @@
# These testcases should raise errors
class Bool:
"""pylint would not raise, but ruff does - see explanation in the docs"""
def __index__(self):
return True # [invalid-index-return]
class Float:
def __index__(self):
return 3.05 # [invalid-index-return]
class Dict:
def __index__(self):
return {"1": "1"} # [invalid-index-return]
class Str:
def __index__(self):
return "ruff" # [invalid-index-return]
class IndexNoReturn:
def __index__(self):
print("ruff") # [invalid-index-return]
# TODO: Once Ruff has better type checking
def return_index():
return "3"
class ComplexReturn:
def __index__(self):
return return_index() # [invalid-index-return]
# These testcases should NOT raise errors
class Index:
def __index__(self):
return 0
class Index2:
def __index__(self):
x = 1
return x
class Index3:
def __index__(self):
...
class Index4:
def __index__(self):
pass
class Index5:
def __index__(self):
raise NotImplementedError
class Index6:
def __index__(self):
print("raise some error")
raise NotImplementedError

View File

@@ -0,0 +1,70 @@
# These testcases should raise errors
class Bool:
def __len__(self):
return True # [invalid-length-return]
class Float:
def __len__(self):
return 3.05 # [invalid-length-return]
class Str:
def __len__(self):
return "ruff" # [invalid-length-return]
class LengthNoReturn:
def __len__(self):
print("ruff") # [invalid-length-return]
class LengthNegative:
def __len__(self):
return -42 # [invalid-length-return]
# TODO: Once Ruff has better type checking
def return_int():
return "3"
class ComplexReturn:
def __len__(self):
return return_int() # [invalid-length-return]
# These testcases should NOT raise errors
class Length:
def __len__(self):
return 42
class Length2:
def __len__(self):
x = 42
return x
class Length3:
def __len__(self): ...
class Length4:
def __len__(self):
pass
class Length5:
def __len__(self):
raise NotImplementedError
class Length6:
def __len__(self):
print("raise some error")
raise NotImplementedError

View File

@@ -1,36 +1,60 @@
# These testcases should raise errors
class Float:
def __str__(self):
return 3.05
class Int:
def __str__(self):
return 1
class Int2:
def __str__(self):
return 0
class Bool:
def __str__(self):
return False
# TODO: Once Ruff has better type checking
def return_int():
return 3
class ComplexReturn:
def __str__(self):
return return_int()
# These testcases should NOT raise errors
class Str:
def __str__(self):
return "ruff"
class Str2:
def __str__(self):
x = "ruff"
return x
class Str3:
def __str__(self): ...
class Str4:
def __str__(self):
raise RuntimeError("__str__ not allowed")
class Str5:
def __str__(self): # PLE0307 (returns None if x <= 0)
if x > 0:
raise RuntimeError("__str__ not allowed")

View File

@@ -47,6 +47,12 @@ if y == np.nan:
if y == npy_nan:
pass
import builtins
# PLW0117
if x == builtins.float("nan"):
pass
# OK
if math.isnan(x):
pass

View File

@@ -39,3 +39,6 @@ max(max(tuples_list))
# Starred argument should be copied as it is.
max(1, max(*a))
import builtins
builtins.min(1, min(2, 3))

View File

@@ -0,0 +1,58 @@
# Errors
some_string = "some string"
index, a_number, to_multiply, to_divide, to_cube, timeDiffSeconds, flags = (
0,
1,
2,
3,
4,
5,
0x3,
)
a_list = [1, 2]
some_set = {"elem"}
mat1, mat2 = None, None
some_string = some_string + "a very long end of string"
index = index - 1
a_list = a_list + ["to concat"]
some_set = some_set | {"to concat"}
to_multiply = to_multiply * 5
to_multiply = 5 * to_multiply
to_multiply = to_multiply * to_multiply
to_divide = to_divide / 5
to_divide = to_divide // 5
to_cube = to_cube**3
to_cube = 3**to_cube
to_cube = to_cube**to_cube
timeDiffSeconds = timeDiffSeconds % 60
flags = flags & 0x1
flags = flags | 0x1
flags = flags ^ 0x1
flags = flags << 1
flags = flags >> 1
mat1 = mat1 @ mat2
a_list[1] = a_list[1] + 1
a_list[0:2] = a_list[0:2] * 3
a_list[:2] = a_list[:2] * 3
a_list[1:] = a_list[1:] * 3
a_list[:] = a_list[:] * 3
index = index * (index + 10)
class T:
def t(self):
self.a = self.a + 1
obj = T()
obj.a = obj.a + 1
# OK
a_list[0] = a_list[:] * 3
index = a_number = a_number + 1
a_number = index = a_number + 1
index = index * index + 10
some_string = "a very long start to the string" + some_string

View File

@@ -0,0 +1,43 @@
class Fruit:
@classmethod
def list_fruits(cls) -> None:
cls = "apple" # PLW0642
cls: Fruit = "apple" # PLW0642
cls += "orange" # PLW0642
*cls = "banana" # PLW0642
cls, blah = "apple", "orange" # PLW0642
blah, (cls, blah2) = "apple", ("orange", "banana") # PLW0642
blah, [cls, blah2] = "apple", ("orange", "banana") # PLW0642
@classmethod
def add_fruits(cls, fruits, /) -> None:
cls = fruits # PLW0642
def print_color(self) -> None:
self = "red" # PLW0642
self: Self = "red" # PLW0642
self += "blue" # PLW0642
*self = "blue" # PLW0642
self, blah = "red", "blue" # PLW0642
blah, (self, blah2) = "apple", ("orange", "banana") # PLW0642
blah, [self, blah2] = "apple", ("orange", "banana") # PLW0642
def print_color(self, color, /) -> None:
self = color
def ok(self) -> None:
cls = None # OK because the rule looks for the name in the signature
@classmethod
def ok(cls) -> None:
self = None
@staticmethod
def list_fruits_static(self, cls) -> None:
self = "apple" # Ok
cls = "banana" # Ok
def list_fruits(self, cls) -> None:
self = "apple" # Ok
cls = "banana" # Ok

View File

@@ -152,3 +152,9 @@ object = A
class B(object):
...
import builtins
class Unusual(builtins.object):
...

View File

@@ -117,3 +117,10 @@ path = "%s-%s-%s.pem" % (
cert.not_valid_after.date().isoformat().replace("-", ""), # expiration date
hexlify(cert.fingerprint(hashes.SHA256())).decode("ascii")[0:8], # fingerprint prefix
)
# UP031 (no longer false negatives; now offer potentially unsafe fixes)
'Hello %s' % bar
'Hello %s' % bar.baz
'Hello %s' % bar['bop']

View File

@@ -32,10 +32,3 @@ pytest.param('"%8s" % (None,)', id="unsafe width-string conversion"),
"%(1)s" % {1: 2, "1": 2}
"%(and)s" % {"and": 2}
# OK (arguably false negatives)
'Hello %s' % bar
'Hello %s' % bar.baz
'Hello %s' % bar['bop']

View File

@@ -59,6 +59,21 @@ with open("file.txt", "w", newline="\r\n") as f:
f.write(foobar)
import builtins
# FURB103
with builtins.open("file.txt", "w", newline="\r\n") as f:
f.write(foobar)
from builtins import open as o
# FURB103
with o("file.txt", "w", newline="\r\n") as f:
f.write(foobar)
# Non-errors.
with open("file.txt", errors="ignore", mode="wb") as f:

View File

@@ -0,0 +1,19 @@
num = 1337
def return_num() -> int:
return num
print(oct(num)[2:]) # FURB116
print(hex(num)[2:]) # FURB116
print(bin(num)[2:]) # FURB116
print(oct(1337)[2:]) # FURB116
print(hex(1337)[2:]) # FURB116
print(bin(1337)[2:]) # FURB116
print(bin(return_num())[2:]) # FURB116 (no autofix)
print(bin(int(f"{num}"))[2:]) # FURB116 (no autofix)
## invalid
print(oct(0o1337)[1:])
print(hex(0x1337)[3:])

View File

@@ -10,7 +10,7 @@ op_mult = lambda x, y: x * y
op_matmutl = lambda x, y: x @ y
op_truediv = lambda x, y: x / y
op_mod = lambda x, y: x % y
op_pow = lambda x, y: x**y
op_pow = lambda x, y: x ** y
op_lshift = lambda x, y: x << y
op_rshift = lambda x, y: x >> y
op_bitor = lambda x, y: x | y
@@ -59,6 +59,7 @@ op_itemgetter = lambda x, y: (x[0], y[0])
op_itemgetter = lambda x, y: (x[0], y[0])
op_itemgetter = lambda x: ()
op_itemgetter = lambda x: (*x[0], x[1])
op_itemgetter = lambda x: (x[0],)
def op_neg3(x, y):

View File

@@ -41,6 +41,22 @@ def func():
pass
import builtins
with builtins.open("FURB129.py") as f:
for line in f.readlines():
pass
from builtins import open as o
with o("FURB129.py") as f:
for line in f.readlines():
pass
# False positives
def func(f):
for _line in f.readlines():

View File

@@ -49,6 +49,14 @@ def yes_five(x: Dict[int, str]):
x = 1
from builtins import list as SneakyList
sneaky = SneakyList()
# FURB131
del sneaky[:]
# these should not
del names["key"]

View File

@@ -0,0 +1,48 @@
# Errors
sorted(l)[0]
sorted(l)[-1]
sorted(l, reverse=False)[-1]
sorted(l, key=lambda x: x)[0]
sorted(l, key=key_fn)[0]
sorted([1, 2, 3])[0]
# Unsafe
sorted(l, key=key_fn, reverse=True)[-1]
sorted(l, reverse=True)[0]
sorted(l, reverse=True)[-1]
# Non-errors
sorted(l, reverse=foo())[0]
sorted(l)[1]
sorted(get_list())[1]
sorted()[0]
sorted(l)[1]
sorted(l)[-2]
b = True
sorted(l, reverse=b)[0]
sorted(l, invalid_kwarg=True)[0]
def sorted():
pass
sorted(l)[0]

View File

@@ -12,6 +12,8 @@ dict.fromkeys(pierogi_fillings, {})
dict.fromkeys(pierogi_fillings, set())
dict.fromkeys(pierogi_fillings, {"pre": "populated!"})
dict.fromkeys(pierogi_fillings, dict())
import builtins
builtins.dict.fromkeys(pierogi_fillings, dict())
# Okay.
dict.fromkeys(pierogi_fillings)

View File

@@ -0,0 +1,75 @@
import time
import asyncio
async def pass_1a(): # OK: awaits a coroutine
await asyncio.sleep(1)
async def pass_1b(): # OK: awaits a coroutine
def foo(optional_arg=await bar()):
pass
async def pass_2(): # OK: uses an async context manager
async with None as i:
pass
async def pass_3(): # OK: uses an async loop
async for i in []:
pass
class Foo:
async def pass_4(self): # OK: method of a class
pass
def foo():
async def pass_5(): # OK: uses an await
await bla
async def pass_6(): # OK: just a stub
...
async def fail_1a(): # RUF029
time.sleep(1)
async def fail_1b(): # RUF029: yield does not require async
yield "hello"
async def fail_2(): # RUF029
with None as i:
pass
async def fail_3(): # RUF029
for i in []:
pass
return foo
async def fail_4a(): # RUF029: the /outer/ function does not await
async def foo():
await bla
async def fail_4b(): # RUF029: the /outer/ function does not await
class Foo:
async def foo(self):
await bla
def foo():
async def fail_4c(): # RUF029: the /inner/ function does not await
pass
async def test():
return [check async for check in async_func()]

View File

@@ -26,10 +26,10 @@ def f() -> None:
# fmt: off
# Invalid - no space before #
d = 1# noqa: E501
d = 1 # noqa: E501
# Invalid - many spaces before #
d = 1 # noqa: E501
d = 1 # noqa: E501
# fmt: on
@@ -104,5 +104,28 @@ def f():
def f():
# Invalid - nonexistant error code with multibyte character
d = 1 #…noqa: F841, E50
e = 1 #…noqa: E50
d = 1 # …noqa: F841, E50
e = 1 # …noqa: E50
def f():
# Disabled - check redirects are reported correctly
eval(command) # noqa: PGH001
# Check duplicate code detection
def f():
x = 2 # noqa: F841, F841, X200
y = 2 == bar # noqa: SIM300, F841, SIM300, SIM300
z = 2 # noqa: F841 F841 F841, F841, F841
return
# Allow code redirects
x = eval(command) # noqa: PGH001, S307
x = eval(command) # noqa: S307, PGH001, S307, S307, S307
x = eval(command) # noqa: PGH001, S307, PGH001
x = eval(command) # noqa: PGH001, S307, PGH001, S307

View File

@@ -0,0 +1,6 @@
x = 2 # noqa: RUF940
x = 2 # noqa: RUF950
x = 2 # noqa: RUF940, RUF950
x = 2 # noqa: RUF950, RUF940, RUF950, RUF950, RUF950
x = 2 # noqa: RUF940, RUF950, RUF940
x = 2 # noqa: RUF940, RUF950, RUF940, RUF950

View File

@@ -1,2 +1,3 @@
import os
import foo # noqa: F401
import bar # noqa

View File

@@ -30,6 +30,9 @@ pub(crate) fn deferred_for_loops(checker: &mut Checker) {
if checker.enabled(Rule::EnumerateForLoop) {
flake8_simplify::rules::enumerate_for_loop(checker, stmt_for);
}
if checker.enabled(Rule::LoopIteratorMutation) {
flake8_bugbear::rules::loop_iterator_mutation(checker, stmt_for);
}
}
}
}

View File

@@ -125,6 +125,12 @@ pub(crate) fn expression(expr: &Expr, checker: &mut Checker) {
if checker.enabled(Rule::PotentialIndexError) {
pylint::rules::potential_index_error(checker, value, slice);
}
if checker.enabled(Rule::SortedMinMax) {
refurb::rules::sorted_min_max(checker, subscript);
}
if checker.enabled(Rule::FStringNumberFormat) {
refurb::rules::fstring_number_format(checker, subscript);
}
pandas_vet::rules::subscript(checker, value, expr);
}
@@ -247,7 +253,7 @@ pub(crate) fn expression(expr: &Expr, checker: &mut Checker) {
}
}
}
ExprContext::Del => {}
_ => {}
}
if checker.enabled(Rule::SixPY3) {
flake8_2020::rules::name_or_attribute(checker, expr);

View File

@@ -95,10 +95,22 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
}
}
if checker.enabled(Rule::InvalidBoolReturnType) {
pylint::rules::invalid_bool_return(checker, name, body);
pylint::rules::invalid_bool_return(checker, function_def);
}
if checker.enabled(Rule::InvalidLengthReturnType) {
pylint::rules::invalid_length_return(checker, function_def);
}
if checker.enabled(Rule::InvalidBytesReturnType) {
pylint::rules::invalid_bytes_return(checker, function_def);
}
if checker.enabled(Rule::InvalidIndexReturnType) {
pylint::rules::invalid_index_return(checker, function_def);
}
if checker.enabled(Rule::InvalidHashReturnType) {
pylint::rules::invalid_hash_return(checker, function_def);
}
if checker.enabled(Rule::InvalidStrReturnType) {
pylint::rules::invalid_str_return(checker, name, body);
pylint::rules::invalid_str_return(checker, function_def);
}
if checker.enabled(Rule::InvalidFunctionName) {
if let Some(diagnostic) = pep8_naming::rules::invalid_function_name(
@@ -146,7 +158,7 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
decorator_list,
returns.as_ref().map(AsRef::as_ref),
parameters,
type_params.as_ref(),
type_params.as_deref(),
);
}
if checker.source_type.is_stub() {
@@ -162,7 +174,7 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
}
}
if checker.enabled(Rule::BadExitAnnotation) {
flake8_pyi::rules::bad_exit_annotation(checker, *is_async, name, parameters);
flake8_pyi::rules::bad_exit_annotation(checker, function_def);
}
if checker.enabled(Rule::RedundantNumericUnion) {
flake8_pyi::rules::redundant_numeric_union(checker, parameters);
@@ -348,6 +360,9 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
if checker.enabled(Rule::SslWithBadDefaults) {
flake8_bandit::rules::ssl_with_bad_defaults(checker, function_def);
}
if checker.enabled(Rule::UnusedAsync) {
ruff::rules::unused_async(checker, function_def);
}
}
Stmt::Return(_) => {
if checker.enabled(Rule::ReturnOutsideFunction) {
@@ -606,7 +621,8 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
pylint::rules::manual_from_import(checker, stmt, alias, names);
}
if checker.enabled(Rule::ImportSelf) {
if let Some(diagnostic) = pylint::rules::import_self(alias, checker.module_path)
if let Some(diagnostic) =
pylint::rules::import_self(alias, checker.module.qualified_name())
{
checker.diagnostics.push(diagnostic);
}
@@ -770,9 +786,11 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
flake8_bandit::rules::suspicious_imports(checker, stmt);
}
if checker.enabled(Rule::BannedApi) {
if let Some(module) =
helpers::resolve_imported_module_path(level, module, checker.module_path)
{
if let Some(module) = helpers::resolve_imported_module_path(
level,
module,
checker.module.qualified_name(),
) {
flake8_tidy_imports::rules::banned_api(
checker,
&flake8_tidy_imports::matchers::NameMatchPolicy::MatchNameOrParent(
@@ -799,9 +817,11 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
}
}
if checker.enabled(Rule::BannedModuleLevelImports) {
if let Some(module) =
helpers::resolve_imported_module_path(level, module, checker.module_path)
{
if let Some(module) = helpers::resolve_imported_module_path(
level,
module,
checker.module.qualified_name(),
) {
flake8_tidy_imports::rules::banned_module_level_imports(
checker,
&flake8_tidy_imports::matchers::NameMatchPolicy::MatchNameOrParent(
@@ -888,7 +908,7 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
stmt,
level,
module,
checker.module_path,
checker.module.qualified_name(),
checker.settings.flake8_tidy_imports.ban_relative_imports,
) {
checker.diagnostics.push(diagnostic);
@@ -987,9 +1007,12 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
}
}
if checker.enabled(Rule::ImportSelf) {
if let Some(diagnostic) =
pylint::rules::import_from_self(level, module, names, checker.module_path)
{
if let Some(diagnostic) = pylint::rules::import_from_self(
level,
module,
names,
checker.module.qualified_name(),
) {
checker.diagnostics.push(diagnostic);
}
}
@@ -1055,6 +1078,9 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
}
}
Stmt::AugAssign(aug_assign @ ast::StmtAugAssign { target, .. }) => {
if checker.enabled(Rule::SelfOrClsAssignment) {
pylint::rules::self_or_cls_assignment(checker, target);
}
if checker.enabled(Rule::GlobalStatement) {
if let Expr::Name(ast::ExprName { id, .. }) = target.as_ref() {
pylint::rules::global_statement(checker, id);
@@ -1268,6 +1294,7 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
if checker.any_enabled(&[
Rule::EnumerateForLoop,
Rule::IncorrectDictIterator,
Rule::LoopIteratorMutation,
Rule::UnnecessaryEnumerate,
Rule::UnusedLoopControlVariable,
Rule::YieldInForLoop,
@@ -1409,6 +1436,11 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
}
}
Stmt::Assign(assign @ ast::StmtAssign { targets, value, .. }) => {
if checker.enabled(Rule::SelfOrClsAssignment) {
for target in targets {
pylint::rules::self_or_cls_assignment(checker, target);
}
}
if checker.enabled(Rule::RedeclaredAssignedName) {
pylint::rules::redeclared_assigned_name(checker, targets);
}
@@ -1478,6 +1510,9 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
if checker.settings.rules.enabled(Rule::TypeBivariance) {
pylint::rules::type_bivariance(checker, value);
}
if checker.enabled(Rule::NonAugmentedAssignment) {
pylint::rules::non_augmented_assignment(checker, assign);
}
if checker.settings.rules.enabled(Rule::UnsortedDunderAll) {
ruff::rules::sort_dunder_all_assign(checker, assign);
}
@@ -1543,6 +1578,9 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
);
}
}
if checker.enabled(Rule::SelfOrClsAssignment) {
pylint::rules::self_or_cls_assignment(checker, target);
}
if checker.enabled(Rule::SelfAssigningVariable) {
pylint::rules::self_annotated_assignment(checker, assign_stmt);
}

View File

@@ -33,4 +33,7 @@ pub(crate) fn string_like(string_like: StringLike, checker: &mut Checker) {
if checker.enabled(Rule::UnnecessaryEscapedQuote) {
flake8_quotes::rules::unnecessary_escaped_quote(checker, string_like);
}
if checker.enabled(Rule::AvoidableEscapedQuote) && checker.settings.flake8_quotes.avoid_escape {
flake8_quotes::rules::avoidable_escaped_quote(checker, string_like);
}
}

View File

@@ -31,8 +31,8 @@ use std::path::Path;
use itertools::Itertools;
use log::debug;
use ruff_python_ast::{
self as ast, Comprehension, ElifElseClause, ExceptHandler, Expr, ExprContext, FStringElement,
Keyword, MatchCase, Parameter, ParameterWithDefault, Parameters, Pattern, Stmt, Suite, UnaryOp,
self as ast, AnyParameterRef, Comprehension, ElifElseClause, ExceptHandler, Expr, ExprContext,
FStringElement, Keyword, MatchCase, Parameter, Parameters, Pattern, Stmt, Suite, UnaryOp,
};
use ruff_text_size::{Ranged, TextRange, TextSize};
@@ -50,11 +50,11 @@ use ruff_python_ast::{helpers, str, visitor, PySourceType};
use ruff_python_codegen::{Generator, Stylist};
use ruff_python_index::Indexer;
use ruff_python_parser::typing::{parse_type_annotation, AnnotationKind};
use ruff_python_semantic::analyze::{imports, typing, visibility};
use ruff_python_semantic::analyze::{imports, typing};
use ruff_python_semantic::{
BindingFlags, BindingId, BindingKind, Exceptions, Export, FromImport, Globals, Import, Module,
ModuleKind, NodeId, ScopeId, ScopeKind, SemanticModel, SemanticModelFlags, StarImport,
SubmoduleImport,
ModuleKind, ModuleSource, NodeId, ScopeId, ScopeKind, SemanticModel, SemanticModelFlags,
StarImport, SubmoduleImport,
};
use ruff_python_stdlib::builtins::{IPYTHON_BUILTINS, MAGIC_GLOBALS, PYTHON_BUILTINS};
use ruff_source_file::{Locator, OneIndexed, SourceRow};
@@ -110,7 +110,7 @@ pub(crate) struct Checker<'a> {
/// The [`Path`] to the package containing the current file.
package: Option<&'a Path>,
/// The module representation of the current file (e.g., `foo.bar`).
module_path: Option<&'a [String]>,
module: Module<'a>,
/// The [`PySourceType`] of the current file.
pub(crate) source_type: PySourceType,
/// The [`CellOffsets`] for the current file, if it's a Jupyter notebook.
@@ -174,7 +174,7 @@ impl<'a> Checker<'a> {
noqa,
path,
package,
module_path: module.path(),
module,
source_type,
locator,
stylist,
@@ -464,7 +464,7 @@ impl<'a> Visitor<'a> for Checker<'a> {
let level = *level;
// Mark the top-level module as "seen" by the semantic model.
if level.map_or(true, |level| level == 0) {
if level == 0 {
if let Some(module) = module.and_then(|module| module.split('.').next()) {
self.semantic.add_module(module);
}
@@ -604,15 +604,11 @@ impl<'a> Visitor<'a> for Checker<'a> {
self.visit_type_params(type_params);
}
for parameter_with_default in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
{
if let Some(expr) = &parameter_with_default.parameter.annotation {
if singledispatch {
for parameter in &**parameters {
if let Some(expr) = parameter.annotation() {
if singledispatch && !parameter.is_variadic() {
self.visit_runtime_required_annotation(expr);
singledispatch = false;
} else {
match annotation {
AnnotationContext::RuntimeRequired => {
@@ -625,42 +621,11 @@ impl<'a> Visitor<'a> for Checker<'a> {
self.visit_annotation(expr);
}
}
};
}
}
if let Some(expr) = &parameter_with_default.default {
if let Some(expr) = parameter.default() {
self.visit_expr(expr);
}
singledispatch = false;
}
if let Some(arg) = &parameters.vararg {
if let Some(expr) = &arg.annotation {
match annotation {
AnnotationContext::RuntimeRequired => {
self.visit_runtime_required_annotation(expr);
}
AnnotationContext::RuntimeEvaluated => {
self.visit_runtime_evaluated_annotation(expr);
}
AnnotationContext::TypingOnly => {
self.visit_annotation(expr);
}
}
}
}
if let Some(arg) = &parameters.kwarg {
if let Some(expr) = &arg.annotation {
match annotation {
AnnotationContext::RuntimeRequired => {
self.visit_runtime_required_annotation(expr);
}
AnnotationContext::RuntimeEvaluated => {
self.visit_runtime_evaluated_annotation(expr);
}
AnnotationContext::TypingOnly => {
self.visit_annotation(expr);
}
}
}
}
for expr in returns {
match annotation {
@@ -784,17 +749,13 @@ impl<'a> Visitor<'a> for Checker<'a> {
}) => {
let mut handled_exceptions = Exceptions::empty();
for type_ in extract_handled_exceptions(handlers) {
if let Some(qualified_name) = self.semantic.resolve_qualified_name(type_) {
match qualified_name.segments() {
["", "NameError"] => {
handled_exceptions |= Exceptions::NAME_ERROR;
}
["", "ModuleNotFoundError"] => {
if let Some(builtins_name) = self.semantic.resolve_builtin_symbol(type_) {
match builtins_name {
"NameError" => handled_exceptions |= Exceptions::NAME_ERROR,
"ModuleNotFoundError" => {
handled_exceptions |= Exceptions::MODULE_NOT_FOUND_ERROR;
}
["", "ImportError"] => {
handled_exceptions |= Exceptions::IMPORT_ERROR;
}
"ImportError" => handled_exceptions |= Exceptions::IMPORT_ERROR,
_ => {}
}
}
@@ -1002,6 +963,7 @@ impl<'a> Visitor<'a> for Checker<'a> {
ExprContext::Load => self.handle_node_load(expr),
ExprContext::Store => self.handle_node_store(id, expr),
ExprContext::Del => self.handle_node_delete(expr),
ExprContext::Invalid => {}
},
_ => {}
}
@@ -1046,19 +1008,11 @@ impl<'a> Visitor<'a> for Checker<'a> {
) => {
// Visit the default arguments, but avoid the body, which will be deferred.
if let Some(parameters) = parameters {
for ParameterWithDefault {
default,
parameter: _,
range: _,
} in parameters
.posonlyargs
.iter()
.chain(&parameters.args)
.chain(&parameters.kwonlyargs)
for default in parameters
.iter_non_variadic_params()
.filter_map(|param| param.default.as_deref())
{
if let Some(expr) = &default {
self.visit_expr(expr);
}
self.visit_expr(default);
}
}
@@ -1125,7 +1079,8 @@ impl<'a> Visitor<'a> for Checker<'a> {
]
) {
Some(typing::Callable::MypyExtension)
} else if matches!(qualified_name.segments(), ["", "bool"]) {
} else if matches!(qualified_name.segments(), ["" | "builtins", "bool"])
{
Some(typing::Callable::Bool)
} else {
None
@@ -1485,20 +1440,8 @@ impl<'a> Visitor<'a> for Checker<'a> {
// Step 1: Binding.
// Bind, but intentionally avoid walking default expressions, as we handle them
// upstream.
for parameter_with_default in &parameters.posonlyargs {
self.visit_parameter(&parameter_with_default.parameter);
}
for parameter_with_default in &parameters.args {
self.visit_parameter(&parameter_with_default.parameter);
}
if let Some(arg) = &parameters.vararg {
self.visit_parameter(arg);
}
for parameter_with_default in &parameters.kwonlyargs {
self.visit_parameter(&parameter_with_default.parameter);
}
if let Some(arg) = &parameters.kwarg {
self.visit_parameter(arg);
for parameter in parameters.iter().map(AnyParameterRef::as_parameter) {
self.visit_parameter(parameter);
}
// Step 4: Analysis
@@ -1570,8 +1513,8 @@ impl<'a> Visitor<'a> for Checker<'a> {
// Step 1: Binding
match type_param {
ast::TypeParam::TypeVar(ast::TypeParamTypeVar { name, range, .. })
| ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple { name, range })
| ast::TypeParam::ParamSpec(ast::TypeParamParamSpec { name, range }) => {
| ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple { name, range, .. })
| ast::TypeParam::ParamSpec(ast::TypeParamParamSpec { name, range, .. }) => {
self.add_binding(
name.as_str(),
*range,
@@ -1581,13 +1524,46 @@ impl<'a> Visitor<'a> for Checker<'a> {
}
}
// Step 2: Traversal
if let ast::TypeParam::TypeVar(ast::TypeParamTypeVar {
bound: Some(bound), ..
}) = type_param
{
self.visit
.type_param_definitions
.push((bound, self.semantic.snapshot()));
match type_param {
ast::TypeParam::TypeVar(ast::TypeParamTypeVar {
bound,
default,
name: _,
range: _,
}) => {
if let Some(expr) = bound {
self.visit
.type_param_definitions
.push((expr, self.semantic.snapshot()));
}
if let Some(expr) = default {
self.visit
.type_param_definitions
.push((expr, self.semantic.snapshot()));
}
}
ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple {
default,
name: _,
range: _,
}) => {
if let Some(expr) = default {
self.visit
.type_param_definitions
.push((expr, self.semantic.snapshot()));
}
}
ast::TypeParam::ParamSpec(ast::TypeParamParamSpec {
default,
name: _,
range: _,
}) => {
if let Some(expr) = default {
self.visit
.type_param_definitions
.push((expr, self.semantic.snapshot()));
}
}
}
}
@@ -1912,7 +1888,7 @@ impl<'a> Checker<'a> {
}
{
let (all_names, all_flags) =
extract_all_names(parent, |name| self.semantic.is_builtin(name));
extract_all_names(parent, |name| self.semantic.has_builtin_binding(name));
if all_flags.intersects(DunderAllFlags::INVALID_OBJECT) {
flags |= BindingFlags::INVALID_ALL_OBJECT;
@@ -2285,10 +2261,15 @@ pub(crate) fn check_ast(
} else {
ModuleKind::Module
},
source: if let Some(module_path) = module_path.as_ref() {
visibility::ModuleSource::Path(module_path)
name: if let Some(module_path) = &module_path {
module_path.last().map(String::as_str)
} else {
visibility::ModuleSource::File(path)
path.file_stem().and_then(std::ffi::OsStr::to_str)
},
source: if let Some(module_path) = module_path.as_ref() {
ModuleSource::Path(module_path)
} else {
ModuleSource::File(path)
},
python_ast,
};

View File

@@ -39,7 +39,7 @@ fn extract_import_map(path: &Path, package: Option<&Path>, blocks: &[&Block]) ->
level,
range: _,
}) => {
let level = level.unwrap_or_default() as usize;
let level = *level as usize;
let module = if let Some(module) = module {
let module: &String = module.as_ref();
if level == 0 {

View File

@@ -3,17 +3,19 @@
use std::path::Path;
use itertools::Itertools;
use ruff_text_size::Ranged;
use rustc_hash::FxHashSet;
use ruff_diagnostics::{Diagnostic, Edit, Fix};
use ruff_python_trivia::CommentRanges;
use ruff_source_file::Locator;
use ruff_text_size::Ranged;
use crate::fix::edits::delete_comment;
use crate::noqa;
use crate::noqa::{Directive, FileExemption, NoqaDirectives, NoqaMapping};
use crate::noqa::{Code, Directive, FileExemption, NoqaDirectives, NoqaMapping};
use crate::registry::{AsRule, Rule, RuleSet};
use crate::rule_redirects::get_redirect_target;
use crate::rules::pygrep_hooks;
use crate::rules::ruff;
use crate::rules::ruff::rules::{UnusedCodes, UnusedNOQA};
use crate::settings::LinterSettings;
@@ -84,7 +86,7 @@ pub(crate) fn check_noqa(
true
}
Directive::Codes(directive) => {
if noqa::includes(diagnostic.kind.rule(), directive.codes()) {
if directive.includes(diagnostic.kind.rule()) {
directive_line
.matches
.push(diagnostic.kind.rule().noqa_code());
@@ -111,6 +113,7 @@ pub(crate) fn check_noqa(
FileExemption::All => true,
FileExemption::Codes(codes) => codes.contains(&Rule::UnusedNOQA.noqa_code()),
})
&& !per_file_ignores.contains(Rule::UnusedNOQA)
{
for line in noqa_directives.lines() {
match &line.directive {
@@ -126,33 +129,37 @@ pub(crate) fn check_noqa(
}
Directive::Codes(directive) => {
let mut disabled_codes = vec![];
let mut duplicated_codes = vec![];
let mut unknown_codes = vec![];
let mut unmatched_codes = vec![];
let mut valid_codes = vec![];
let mut self_ignore = per_file_ignores.contains(Rule::UnusedNOQA);
for code in directive.codes() {
let code = get_redirect_target(code).unwrap_or(code);
let mut seen_codes = FxHashSet::default();
let mut self_ignore = false;
for original_code in directive.iter().map(Code::as_str) {
let code = get_redirect_target(original_code).unwrap_or(original_code);
if Rule::UnusedNOQA.noqa_code() == code {
self_ignore = true;
break;
}
if line.matches.iter().any(|match_| *match_ == code)
if !seen_codes.insert(original_code) {
duplicated_codes.push(original_code);
} else if line.matches.iter().any(|match_| *match_ == code)
|| settings
.external
.iter()
.any(|external| code.starts_with(external))
{
valid_codes.push(code);
valid_codes.push(original_code);
} else {
if let Ok(rule) = Rule::from_code(code) {
if settings.rules.enabled(rule) {
unmatched_codes.push(code);
unmatched_codes.push(original_code);
} else {
disabled_codes.push(code);
disabled_codes.push(original_code);
}
} else {
unknown_codes.push(code);
unknown_codes.push(original_code);
}
}
}
@@ -162,6 +169,7 @@ pub(crate) fn check_noqa(
}
if !(disabled_codes.is_empty()
&& duplicated_codes.is_empty()
&& unknown_codes.is_empty()
&& unmatched_codes.is_empty())
{
@@ -172,6 +180,10 @@ pub(crate) fn check_noqa(
.iter()
.map(|code| (*code).to_string())
.collect(),
duplicated: duplicated_codes
.iter()
.map(|code| (*code).to_string())
.collect(),
unknown: unknown_codes
.iter()
.map(|code| (*code).to_string())
@@ -202,6 +214,14 @@ pub(crate) fn check_noqa(
}
}
if settings.rules.enabled(Rule::BlanketNOQA) {
pygrep_hooks::rules::blanket_noqa(diagnostics, &noqa_directives, locator);
}
if settings.rules.enabled(Rule::RedirectedNOQA) {
ruff::rules::redirected_noqa(diagnostics, &noqa_directives);
}
ignored_diagnostics.sort_unstable();
ignored_diagnostics
}

Some files were not shown because too many files have changed in this diff Show More