Compare commits


294 Commits

Author SHA1 Message Date
Jane Lewis
7f9c31720b Apply ruff-vscode settings suggestion 2024-06-28 06:29:32 -07:00
Jane Lewis
6034a4e0ba Apply suggestions
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-06-28 06:27:37 -07:00
Jane Lewis
e7a7da9ae9 Reduce all headings and re-add a single top-level heading 2024-06-28 05:39:12 -07:00
Jane Lewis
2fda0038d8 Remove unavailable link 2024-06-26 15:32:06 -07:00
Jane Lewis
800ff0a693 Update Integrations page to cover ruff server 2024-06-26 15:28:43 -07:00
baggiponte
55f4812051 docs: add and formatter to CLI startup message (#12042)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-06-26 10:57:10 +00:00
Dhruv Manilawala
47c9ed07f2 Consider 2-character EOL before line continuation (#12035)
## Summary

This PR fixes a bug introduced in
https://github.com/astral-sh/ruff/pull/12008, which didn't consider a two-character newline after the line continuation character.

For example, consider the following code with whitespace characters made visible:
```py
call(foo # comment \\r\n
\r\n
def bar():\r\n
....pass\r\n
```
The lexer is at `def` when it's running the re-lexing logic and trying
to move back to a newline character. It encounters `\n` and checks whether
it's escaped by the line continuation character, concluding that it isn't
(which is incorrect: the escaped character is `\r`, the first half of the
two-character `\r\n` line ending), so it moves the lexer to the `\n`
character. This creates an overlap in token ranges which causes the
panic.

```
Name 0..4
Lpar 4..5
Name 5..8
Comment 9..20
NonLogicalNewline 20..22 <-- overlap between
Newline 21..22           <-- these two tokens
NonLogicalNewline 22..23
Def 23..26
...
```

fixes: #12028 

## Test Plan

Add a test case with a line continuation and a Windows-style newline
character.
2024-06-26 14:00:48 +05:30
Dhruv Manilawala
7cb2619ef5 Add syntax error for empty type parameter list (#12030)
## Summary

(I'm pretty sure I added this in the parser re-write, but it must've
gotten lost in the rebase?)

This PR raises a syntax error if the type parameter list is empty.

As per the grammar, there should be at least one type parameter:
```
type_params: 
    | invalid_type_params
    | '[' type_param_seq ']' 

type_param_seq: ','.type_param+ [','] 
```

Verified via the builtin `ast` module as well:
```console    
$ python3.13 -m ast parser/_.py
Traceback (most recent call last):
  [..]
  File "parser/_.py", line 1
    def foo[]():
            ^
SyntaxError: Type parameter list cannot be empty
```

## Test Plan

Add inline test cases and update the snapshots.
2024-06-26 08:10:35 +05:30
Charlie Marsh
83fe44728b Match import name ignores against both name and alias (#12033)
## Summary

Right now, it's inconsistent... We sometimes match against the name, and
sometimes against the alias (`asname`). I could see a case for always
matching against the name, but matching against both seems fine too,
since the rule is really about the combination of the two?

Closes https://github.com/astral-sh/ruff/issues/12031.
2024-06-25 18:47:19 -04:00
Alex Waygood
00e456ead4 Fix RUF027 false positives if gettext is imported using an alias (#12025) 2024-06-25 19:10:25 +01:00
Dhruv Manilawala
2853751344 Avoid E203 for f-string debug expression (#12024)
## Summary

This PR fixes a bug where Ruff would raise `E203` for an f-string debug
expression. That isn't valid because whitespace is significant in debug
expressions.
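
For illustration, a minimal sketch of such a debug expression (hypothetical example):

```py
x = 42

# The spaces around `=` in the debug expression are reproduced verbatim in
# the output, so the space before `:` must not be flagged or removed:
print(f"{x = :>5}")  # prints "x =    42"
```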

fixes: #12023

## Test Plan

Add test case and make sure there are no snapshot changes.
2024-06-25 15:00:31 +05:30
Dhruv Manilawala
7109214b57 Update parser tests to validate token ranges (#12019)
## Summary

This PR updates the parser test infrastructure to validate the token
ranges.

From the code documentation:
```
/// Verifies that:
/// * the ranges are strictly increasing when looping over the tokens in insertion order
/// * all ranges are within the length of the source code
```

Follow-up from #12016 and #12017
resolves: #11938

## Test Plan

Make sure that there are no failures.
2024-06-25 08:14:28 +00:00
Dhruv Manilawala
d930e97212 Do not include newline for unterminated string range (#12017)
## Summary

This PR updates the unterminated string error range to not include the
final newline character.

This is a follow-up to #12016 and required for #12019

This is not done when the unterminated string runs to the end of the
file (i.e., there's no newline character). The unterminated f-string range
is already correct.

### Why is this required for #12019 ?

Because otherwise the token ranges will overlap. For example:
```py
f"{"
f"{foo!r"
```

Here, the re-lexing logic recovers from an unterminated f-string and
thus emits a `Newline` token for the one at the end of the first
line. But currently the `Unknown` and the `Newline` tokens would overlap
because the `Unknown` token's (unterminated string literal's) range would
include the newline character.

## Test Plan

Update and validate the snapshot.
2024-06-25 08:10:07 +00:00
Dhruv Manilawala
9c1b6ec411 Use correct range to highlight line continuation error (#12016)
## Summary

This PR fixes the range highlighted for the line continuation error.

Previously, it would highlight an incorrect range:
```
1 | call(a, b, \\\
  |           ^^ Syntax Error: unexpected character after line continuation character
2 | 
3 | def bar():
  |
```

And now:
```
  |
1 | call(a, b, \\\
  |             ^ Syntax Error: unexpected character after line continuation character
2 | 
3 | def bar():
  |
```

This is implemented by not updating the token range for the
`Unknown` token, which is emitted when there's a lexical error. Instead,
the `push_error` helper method is responsible for updating the range
to the error location.

This actually becomes a requirement, as can be seen in follow-up PRs.

## Test Plan

Update and validate the snapshot.
2024-06-25 13:35:24 +05:30
Micha Reiser
692309ebd7 [red-knot] Fix tests in release builds (#12022) 2024-06-25 06:34:35 +00:00
Dhruv Manilawala
68a8978454 Consider line continuation character for re-lexing (#12008)
## Summary

This PR fixes a bug where the re-lexing logic didn't consider a line
continuation character being present before the newline character. This
meant that the lexer was being moved back to a newline character which
is actually escaped via `\`.

Considering the following code:
```py
f'middle {'string':\
        'format spec'}

```

The old token stream is:
```
...
Colon 18..19
FStringMiddle 19..29 (flags = F_STRING)
Newline 20..21
Indent 21..29
String 29..42
Rbrace 42..43
...
```

Notice how the ranges are overlapping between the `FStringMiddle` token
and the tokens emitted after moving the lexer backwards.

After this fix, the lexer is no longer moved backwards in this scenario,
and the new token stream is:
```
FStringStart 0..2 (flags = F_STRING)
FStringMiddle 2..9 (flags = F_STRING)
Lbrace 9..10
String 10..18
Colon 18..19
FStringMiddle 19..29 (flags = F_STRING)
FStringEnd 29..30 (flags = F_STRING)
Name 30..36
Name 37..41
Unknown 41..44
Newline 44..45
```

fixes: #12004 

## Test Plan

Add test cases and update the snapshots.
2024-06-25 02:13:54 +00:00
Alex Waygood
cd2af3be73 [red-knot] Reduce allocations when normalizing VendoredPaths (#11992) 2024-06-24 13:08:01 +01:00
Micha Reiser
e2e98d005c Fix missing related settings header (#12013) 2024-06-24 12:29:10 +02:00
renovate[bot]
32ccc38365 Update NPM Development dependencies (#11999) 2024-06-23 20:50:01 -04:00
renovate[bot]
35151080b1 Update pre-commit dependencies (#11998) 2024-06-23 20:49:55 -04:00
renovate[bot]
53a80a5c11 Update Rust crate rustc-hash to v2 (#12001) 2024-06-23 20:46:42 -04:00
renovate[bot]
446ad0ba44 Update docker/build-push-action action to v6 (#12002) 2024-06-24 00:29:47 +00:00
renovate[bot]
d897811f00 Update Rust crate mimalloc to v0.1.43 (#11993) 2024-06-23 20:29:02 -04:00
renovate[bot]
5d6b26ed33 Update dependency monaco-editor to ^0.50.0 (#12000) 2024-06-23 20:28:52 -04:00
renovate[bot]
49e5357dac Update Rust crate syn to v2.0.68 (#11996) 2024-06-24 00:21:36 +00:00
renovate[bot]
e02c44e46c Update Rust crate url to v2.5.2 (#11997) 2024-06-24 00:21:23 +00:00
renovate[bot]
86cbf2d594 Update Rust crate strum to v0.26.3 (#11995) 2024-06-24 00:19:51 +00:00
renovate[bot]
b79f1ed7f5 Update Rust crate proc-macro2 to v1.0.86 (#11994) 2024-06-24 00:19:06 +00:00
ukyen
068b75cc8e [pyflakes] Detect assignments that shadow definitions (F811) (#11961)
## Summary
This PR updates the `F811` rule to consider assignments as possible shadowed
bindings. This will fix issue #11828.
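
A minimal sketch of the newly detected pattern (hypothetical example):

```py
foo = 1


def foo():  # F811: redefinition now also fires when shadowing an assignment
    return 2
```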

## Test Plan

Add a test file, F811_30.py, which includes a redefinition after an
assignment, along with a verified snapshot file.
2024-06-23 13:29:32 -04:00
Denny Wong
c3f61a012e [ruff] Add assert-with-print-expression rule (#11974) (#11981)
## Summary

Addresses #11974 to add a `RUF` rule to replace `print` expressions in
`assert` statements with the inner message.

An autofix is available, but is considered unsafe as it changes runtime
behaviour, notably:
- removal of the printout to `stdout`, and
- the `AssertionError` instance containing a different message.

While detecting the condition is straightforward, deciding how to
resolve the print arguments into a string literal is a relatively
subjective matter. The implementation in this PR chooses to be as
tolerant as possible, and will attempt to reformat any number of
`print` arguments containing single or concatenated strings or variables
into either a string literal, or an f-string if any variables or
placeholders are detected.

## Test Plan

`cargo test`.

## Examples
For ease of discussion, this is the diff for the tests:

```diff
 # Standard Case
 # Expects:
 # - single StringLiteral
-assert True, print("This print is not intentional.")
+assert True, "This print is not intentional."
 
 # Concatenated string literals
 # Expects:
 # - single StringLiteral
-assert True, print("This print" " is not intentional.")
+assert True, "This print is not intentional."
 
 # Positional arguments, string literals
 # Expects:
 # - single StringLiteral concatenated with " "
-assert True, print("This print", "is not intentional")
+assert True, "This print is not intentional"
 
 # Concatenated string literals combined with Positional arguments
 # Expects:
 # - single stringliteral concatenated with " " only between `print` and `is`
-assert True, print("This " "print", "is not intentional.")
+assert True, "This print is not intentional."
 
 # Positional arguments, string literals with a variable
 # Expects:
 # - single FString concatenated with " "
-assert True, print("This", print.__name__, "is not intentional.")
+assert True, f"This {print.__name__} is not intentional."

 # Mixed brackets string literals
 # Expects:
 # - single StringLiteral concatenated with " "
-assert True, print("This print", 'is not intentional', """and should be removed""")
+assert True, "This print is not intentional and should be removed"
 
 # Mixed brackets with other brackets inside
 # Expects:
 # - single StringLiteral concatenated with " " and escaped brackets
-assert True, print("This print", 'is not "intentional"', """and "should" be 'removed'""")
+assert True, "This print is not \"intentional\" and \"should\" be 'removed'"
 
 # Positional arguments, string literals with a separator
 # Expects:
 # - single StringLiteral concatenated with "|"
-assert True, print("This print", "is not intentional", sep="|")
+assert True, "This print|is not intentional"
 
 # Positional arguments, string literals with None as separator
 # Expects:
 # - single StringLiteral concatenated with " "
-assert True, print("This print", "is not intentional", sep=None)
+assert True, "This print is not intentional"
 
 # Positional arguments, string literals with variable as separator, needs f-string
 # Expects:
 # - single FString concatenated with "{U00A0}"
-assert True, print("This print", "is not intentional", sep=U00A0)
+assert True, f"This print{U00A0}is not intentional"
 
 # Unnecessary f-string
 # Expects:
 # - single StringLiteral
-assert True, print(f"This f-string is just a literal.")
+assert True, "This f-string is just a literal."
 
 # Positional arguments, string literals and f-strings
 # Expects:
 # - single FString concatenated with " "
-assert True, print("This print", f"is not {'intentional':s}")
+assert True, f"This print is not {'intentional':s}"
 
 # Positional arguments, string literals and f-strings with a separator
 # Expects:
 # - single FString concatenated with "|"
-assert True, print("This print", f"is not {'intentional':s}", sep="|")
+assert True, f"This print|is not {'intentional':s}"
 
 # A single f-string
 # Expects:
 # - single FString
-assert True, print(f"This print is not {'intentional':s}")
+assert True, f"This print is not {'intentional':s}"
 
 # A single f-string with a redundant separator
 # Expects:
 # - single FString
-assert True, print(f"This print is not {'intentional':s}", sep="|")
+assert True, f"This print is not {'intentional':s}"
 
 # Complex f-string with variable as separator
 # Expects:
 # - single FString concatenated with "{U00A0}", all placeholders preserved
 condition = "True is True"
 maintainer = "John Doe"
-assert True, print("Unreachable due to", condition, f", ask {maintainer} for advice", sep=U00A0)
+assert True, f"Unreachable due to{U00A0}{condition}{U00A0}, ask {maintainer} for advice"
 
 # Empty print
 # Expects:
 # - `msg` entirely removed from assertion
-assert True, print()
+assert True
 
 # Empty print with separator
 # Expects:
 # - `msg` entirely removed from assertion
-assert True, print(sep=" ")
+assert True
 
 # Custom print function that actually returns a string
 # Expects:
@@ -100,4 +100,4 @@
 # Use of `builtins.print`
 # Expects:
 # - single StringLiteral
-assert True, builtins.print("This print should be removed.")
+assert True, "This print should be removed."
```

## Known Issues

The current implementation resolves all arguments and separators of the
`print` expression into a single string, be it a
`StringLiteralValue::single` or an `FStringValue::single`. This:

- potentially joins together strings well beyond the ideal character
limit for each line, and
- does not preserve multi-line strings in their original format, in
favour of a single-line `"...\n...\n..."` format.

These are purely formatting issues only occurring in unusual scenarios.

Additionally, the autofix will tolerate `print` calls that were
previously invalid:

```python
assert True, print("this", "should not be allowed", sep=42)
```

This will be transformed into:
```python
assert True, f"this{42}should not be allowed"
```
which some could argue is an alteration of behaviour.

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-06-23 16:54:55 +00:00
Gilles Peiffer
0c8b5eb17a Clarify special control flow parameters for PLR0917: too-many-positional (#11978) 2024-06-23 11:16:09 -04:00
Alex Waygood
375d2c87b2 [red-knot] Simplify conversions from std::path::Path to VendoredPath(Buf) (#11988) 2024-06-23 15:52:26 +01:00
Alex Waygood
f846fc9e07 [red-knot] Once again, add more tests asserting that the VendoredFileSystem and the VERSIONS parser work with the vendored typeshed stubs (#11987) 2024-06-23 14:57:43 +01:00
Alex Waygood
92b145e56a [red-knot] Manually implement Debug for VendoredFileSystem (#11983) 2024-06-23 14:25:56 +01:00
Eric Nielsen
715609663a Update PEP reference in future_rewritable_type_annotation.rs (#11985)

## Summary

Documentation mentions:

> PEP 563 enabled the use of a number of convenient type annotations,
such as `list[str]` instead of `List[str]`

but it meant [PEP 585](https://peps.python.org/pep-0585/) instead.

[PEP 563](https://peps.python.org/pep-0563/) is the one defining `from
__future__ import annotations`.

## Test Plan

No automated test required, just verify that
https://peps.python.org/pep-0585/ is the correct reference.
2024-06-22 20:15:12 -05:00
Micha Reiser
519a278899 [red-knot] Remove itertools dependency from ruff_db (#11984) 2024-06-22 18:37:51 +00:00
Alex Waygood
91d091bb81 [red-knot] Use POSIX representations of paths when creating the typeshed zip file (#11982) 2024-06-22 17:54:19 +01:00
Rune Lausen
79d72e6479 docs(integrations): fix link to python-lsp-server (#11980)
Co-authored-by: Rune Lausen <rune@lausennet.dk>
2024-06-22 13:17:50 +01:00
Dhruv Manilawala
81160320de Manual impl of Debug on Token (#11958)
## Summary

I look at the token stream a lot, not specifically in the playground but
in the terminal output, and it's annoying to scroll a lot to find a
specific location. Most of the information is also redundant.

The final format we end up with is: `<kind> <range> (flags = ...)` e.g.,
`String 0..4 (flags = BYTE_STRING)` where the flags part is only
populated if there are any flags set.
2024-06-22 04:18:24 +00:00
R1kaB3rN
b1e7bf76da Add Open Wine Components to "Who's Using Ruff?" (#11976) 2024-06-21 19:59:40 +00:00
Jane Lewis
ad4a88657b Remove usage of std::path::absolute from snapshot test (#11973) 2024-06-21 20:21:12 +01:00
Alex Waygood
611f4e5c5f Revert "[red-knot] Add more tests asserting that the VendoredFileSystem and the VERSIONS parser work with the vendored typeshed stubs" (#11975) 2024-06-21 19:14:24 +00:00
Jane Lewis
791f6a1820 ruff server: Closing an untitled, unsaved notebook document no longer throws an error (#11942)
## Summary

Fixes #11651.
Fixes #11851.

We were double-closing a notebook document from the index, once in
`textDocument/didClose` and then in the `notebookDocument/didClose`
handler. The second time this happens, taking a snapshot fails.

I've rewritten how we handle snapshots for closing notebooks / notebook
cells so that any failure is simply logged instead of propagating
upwards. This implementation works consistently even if we don't receive
`textDocument/didClose` notifications for each specific cell, since they
get closed (and the diagnostics get cleared) in the notebook document
removal process.

## Test Plan

1. Open an untitled, unsaved notebook with the `Create: New Jupyter
Notebook` command from the VS Code command palette (`Ctrl/Cmd + Shift +
P`)
2. Without saving the document, close it.
3. No error popup should appear.
4. Run the debug command (`Ruff: print debug information`) to confirm
that there are no open documents
2024-06-21 10:53:30 -07:00
Alex Waygood
3d0230f469 [red-knot] Add more tests asserting that the VendoredFileSystem and the VERSIONS parser work with the vendored typeshed stubs (#11970) 2024-06-21 16:53:10 +00:00
Alex Waygood
da79bac33c [red-knot] Make the VERSIONS parser use ModuleName as its key type (#11968) 2024-06-21 15:46:45 +00:00
Alex Waygood
8de0cd6565 [red-knot] Move typeshed VERSIONS parser to the module resolver crate (#11967) 2024-06-21 16:41:08 +01:00
Alex Waygood
3277d031f8 [red-knot] Move the vendored typeshed stubs to the module resolver crate (#11966) 2024-06-21 13:47:54 +00:00
Alex Waygood
736a4ead14 [red-knot] Move module-resolution logic to its own crate (#11964) 2024-06-21 13:25:44 +00:00
Dhruv Manilawala
27ebff36ec Remove Token::is_trivia method (#11962)
Sorry, a leftover from my rebase
2024-06-21 10:24:42 +00:00
Dhruv Manilawala
96da136e6a Move token and error structs into related modules (#11957)
## Summary

This PR does some housekeeping, moving certain structs into related
modules. Specifically:
1. Move `LexicalError` from `lexer.rs` to `error.rs` which also contains
the `ParseError`
2. Move `Token`, `TokenFlags` and `TokenValue` from `lexer.rs` to
`token.rs`
2024-06-21 10:07:19 +00:00
Dhruv Manilawala
4667d8697c Remove duplication around is_trivia functions (#11956)
## Summary

This PR removes the duplication around `is_trivia` functions.

There are two of them in the codebase:
1. In `pycodestyle`, it's for newline, indent, dedent, non-logical
newline and comment
2. In the parser, it's for non-logical newline and comment

The `TokenKind::is_trivia` method used (1) but that's not correct in
that context. So, this PR introduces a new `is_non_logical_token` helper
method for the `pycodestyle` crate and updates the
`TokenKind::is_trivia` implementation with (2).

This also means we can remove `Token::is_trivia` method and the
standalone `token_source::is_trivia` function and use the one on
`TokenKind`.

## Test Plan

`cargo insta test`
2024-06-21 10:02:40 +00:00
Will Yardley
690e94f4fb ruff-check: update docs for fix_only (#11959) 2024-06-21 08:13:04 +02:00
dedebenui
9fd84e63bc Update trapz and in1d deprecation for NPY201 (#11948) 2024-06-21 08:08:00 +02:00
Jane Lewis
3ab7a8da73 Add Jupyter Notebook document change snapshot test (#11944)
## Summary

Closes #11914.

This PR introduces a snapshot test that replays the LSP requests made
during a document formatting request, and confirms that the notebook
document is updated in the expected way.
2024-06-21 05:29:27 +00:00
Micha Reiser
927069c12f [red-knot] Upgrade to Salsa 3.0 (#11952) 2024-06-20 20:19:16 +01:00
Jane Lewis
c8ff89c73c ruff server: Support the usage of tildes and environment variables in logFile (#11945)
## Summary

Fixes #11911.

`shellexpand` is now used on `logFile` to expand the file path, allowing
the usage of `~` and environment variables.

## Test Plan

1. Set `logFile` in either Neovim or Helix to a file path that needs
expansion, like `~/.config/helix/ruff_logs.txt`.
2. Ensure that `RUFF_TRACE` is set to `messages` or `verbose`
3. Open a Python file in Neovim/Helix
4. Confirm that a file at the path specified was created, with the
expected logs.
2024-06-20 18:51:46 +00:00
Dhruv Manilawala
4c05d7a6d4 Provide link on how to re-run all failed jobs (#11954) 2024-06-20 23:19:43 +05:30
Dhruv Manilawala
b54922fd73 Bump version to v0.4.10 (#11953) 2024-06-20 22:37:44 +05:30
Dhruv Manilawala
3f884b4b34 Avoid running logical line rule logic if not enabled (#11951)
## Summary

This PR updates the logical line rules' entry-point function to only run
the logic if any of the rules within that group is enabled.

Although this shouldn't really give any performance improvements, it's
better not to do additional work if we can avoid it. This is also
consistent with how other rules are run.

## Test Plan

`cargo insta test`
2024-06-20 16:28:53 +00:00
Micha Reiser
b456051be8 [red-knot] Add tracing to Salsa queries (#11949) 2024-06-20 13:33:41 +02:00
Micha Reiser
2dfbf118d7 [red-knot] Extract red_knot_python_semantic crate (#11926) 2024-06-20 13:24:24 +02:00
Dhruv Manilawala
ed948eaefb Avoid moving back the lexer for triple-quoted fstring (#11939)
## Summary

This PR avoids moving back the lexer for a triple-quoted f-string during
the re-lexing phase.

The reason this is a problem is that for a triple-quoted f-string the
newlines are part of the f-string itself, specifically they'll be part
of the `FStringMiddle` token. So, if we moved the lexer back, there
would be a `Newline` token whose range would fall within an
`FStringMiddle` token's range. This causes a panic in downstream usage.
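
For illustration, a hypothetical input of the problematic shape:

```py
# The newline after `{x` is part of the FStringMiddle token of the
# unterminated triple-quoted f-string, so the lexer must not be moved
# back past it during recovery:
f"""hello {x
y = 1
```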

fixes: #11937 

## Test Plan

Add test cases and validate the snapshots.
2024-06-20 16:27:36 +05:30
Micha Reiser
22733cb7c7 red-knot(Salsa): Types without refinements (#11899) 2024-06-20 12:49:38 +02:00
Dhruv Manilawala
a26bd01be2 Avoid depth counting when detecting indentation (#11947)
## Summary

This PR drops the `depth` counter when detecting indentation from
non-logical lines because it appears to never be used. It might have been
a leftover from when the logic was originally added in #11608.

## Test Plan

`cargo insta test`
2024-06-20 10:42:35 +05:30
Dhruv Manilawala
b617d90651 Update E999 to show all syntax errors (#11900)
## Summary

This PR updates the linter to show all the parse errors as diagnostics
instead of just the first one.
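
For example, a (hypothetical) file containing two independent syntax errors now yields a diagnostic for each:

```py
def f(:    # first syntax error: invalid parameter list
    pass

1 +        # second syntax error: incomplete expression
```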

Note that this doesn't affect the parse error displayed as an error log
message. That will be removed in a follow-up PR.

### Breaking?

I don't think this is a breaking change even though it might produce more
diagnostics: it shouldn't affect any users because it only yields
additional diagnostics in the case of multiple syntax errors.

## Test Plan

Add an integration test case which would raise more than one parse
error.
2024-06-19 13:09:54 +05:30
Dhruv Manilawala
cdc7c71449 Avoid consuming trailing whitespace during re-lexing (#11933)
## Summary

This PR updates the re-lexing logic to avoid consuming the trailing
whitespace and move the lexer explicitly to the last newline character
encountered while moving backwards.

Consider the following code snippet, taken from the test case, with
whitespace (`.`) and newline (`\n`) characters made visible:
```py
# There are trailing whitespace before the newline character but those whitespaces are
# part of the comment token
f"""hello {x # comment....\n
#                     ^
y = 1\n
```

The parser is at `y` when it's trying to recover from an unclosed `{`,
so it calls into the re-lexing logic, which tries to move the lexer back
to the end of the previous line. But, as it consumed all the whitespace, it
moved the lexer to the location marked by `^` in the above code snippet.
Those whitespace characters, however, are part of the comment token. This
means that the ranges of the two tokens were overlapping, which introduced
the panic.

Note that this is only a bug when there's a comment with trailing
whitespace; otherwise it's fine to move the lexer to the whitespace
character, because the lexer would just skip the whitespace.
Nevertheless, this PR updates the logic to move explicitly
to the newline character in all cases.

fixes: #11929 

## Test Plan

Add test cases and update the snapshot. Make sure that it doesn't panic
on the code snippet in the linked issue.
2024-06-19 12:14:18 +05:30
Jane Lewis
ff3bf583b2 ruff server: Add tracing setup guide to Neovim documentation (#11884)
A follow-up to [this
suggestion](https://github.com/astral-sh/ruff/pull/11747#discussion_r1634297757)
on the tracing PR.

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-06-18 13:39:41 -07:00
Adrin Jalali
2e7c3454e0 ENH copyright-notice: check in the first 4096 bytes instead of 1024 (#11927)

## Summary
related to https://github.com/astral-sh/ruff/issues/5306

The check right now only looks at the first 1024 bytes, and that's
really not enough when there's a docstring at the beginning of a file.
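
A sketch of the failure mode (hypothetical file; the notice text is a placeholder):

```py
"""A module docstring that, with usage examples and references,
easily grows past the first 1024 bytes of the file...
"""

# Copyright (c) 2024 Example Authors
# Previously missed by CPY001: the notice only starts beyond byte 1024.
```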

A more proper fix might be needed, which might be more complex (and I
don't have the Rust skills to implement that). But this temporary
"fix" might enable more users to use the rule.

Context: we want to use this rule in
https://github.com/scikit-learn/scikit-learn/ and we got blocked by
this hardcoded limit (which, TBH, took us quite a while to figure out,
since it's not documented).

## Test Plan

This is already kinda tested; I modified the test for the new byte count.
2024-06-18 11:04:34 -05:00
Alex Waygood
1d73d60bd3 [red-knot]: Add a VendoredFileSystem implementation (#11863)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-06-18 15:43:39 +00:00
Micha Reiser
f666d79cd7 red-knot: Symbol table (#11860) 2024-06-18 13:10:45 +00:00
Micha Reiser
26ac805e6d red-knot: Port module resolver to salsa (#11835) 2024-06-18 12:11:58 +00:00
Micha Reiser
98b13b9844 red-knot: Add a method to resolve a file for an arbitrary VfsPath (#11826)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-06-18 12:03:30 +00:00
Dhruv Manilawala
13ad24b13e Avoid syntax errors for test cases (#11923)
## Summary

This PR removes most of the syntax errors from the test cases. This
would create noise when https://github.com/astral-sh/ruff/pull/11901 is
complete. These syntax errors are also just noise for the test itself.

## Test Plan

Update the snapshots and verify that they're still the same.
2024-06-18 17:16:27 +05:30
psychedelicious
104608b2f7 Update docs for E711, E712 (#4560) (#11859) 2024-06-18 11:20:37 +01:00
Dhruv Manilawala
1e0642fac8 Use re-lexing for normal list parsing (#11871)
## Summary

This PR is a follow-up on #11845 to add the re-lexing logic for normal
list parsing.

Normal list parsing is basically parsing elements without any
separator in between, i.e., there can only be trivia tokens between
two elements. Currently, this is only being used for parsing
**assignment statements** and **f-string elements**. Assignment
statements cannot be in a parenthesized context, but f-strings can have
curly braces, so this PR is specifically for them.

I don't think this is an ideal recovery, but the problem is that both the
lexer and the parser could add an error for f-strings. If the lexer adds an
error, it'll emit an `Unknown` token instead, while the parser adds the
error directly. I think we'd need to move all f-string errors to be
emitted by the parser instead. That way the parser can correctly inform
the lexer that it's out of an f-string, and the lexer can pop the
current f-string context off the stack.

## Test Plan

Add test cases, update the snapshots, and run the fuzzer.
2024-06-18 12:14:41 +05:30
Micha Reiser
5e5a81b05f Fix Fuzz build (#11919) 2024-06-18 06:44:19 +00:00
Jane Lewis
c53d55a483 ruff server: Add tracing setup guide to Helix documentation (#11883)
A follow-up to [this
suggestion](https://github.com/astral-sh/ruff/pull/11747#discussion_r1634297757)
on the tracing PR.
2024-06-18 03:41:24 +00:00
Jane Lewis
ffc98522cd ruff server: Defer notebook cell deletion to avoid an error message (#11864)
## Summary

Fixes https://github.com/astral-sh/ruff-vscode/issues/496.

Cells are no longer removed from the notebook index when a notebook gets
updated, but rather when `textDocument/didClose` is called for them.
This solves an issue where their premature removal from the notebook
cell index would cause their URL to be un-queryable in the
`textDocument/didClose` handler.

## Test Plan

Create and then delete a notebook cell in VS Code. No error should
appear.
2024-06-18 03:37:40 +00:00
Dhruv Manilawala
8499abfa7f Implement re-lexing logic for better error recovery (#11845)
## Summary

This PR implements the re-lexing logic in the parser.

This logic is only applied when recovering from an error during list
parsing. The logic is as follows:
1. During list parsing, if an unexpected token is encountered and it
detects that an outer context can understand it and thus recover from
it, it invokes the re-lexing logic in the lexer
2. This logic first checks if the lexer is in a parenthesized context
and returns if it's not. Thus, the logic is a no-op if the lexer isn't
in a parenthesized context
3. It then reduces the nesting level by 1. It shouldn't reset it to 0
because otherwise the recovery from nested list parsing would be
incorrect
4. Then, it tries to find the last newline character going backwards from
the current position of the lexer. It skips any whitespace, but if it
encounters any character other than a newline or whitespace, it aborts.
5. Now, if there's a newline character, then it needs to be re-lexed in
a logical context, which means that the lexer needs to emit it as a
`Newline` token instead of `NonLogicalNewline`.
6. If the re-lexing gives a different token than the current one, the
token source needs to update its token collection to remove all the
tokens which come after the new current position.

It turns out that the list parsing isn't that happy with the results, so
it requires some re-arranging such that the following two errors are
raised correctly:
1. Expected comma
2. Recovery context error

For (1), the following scenarios need to be considered:
* Missing comma between two elements
* Half parsed element because the grammar doesn't allow it (for example,
named expressions)

For (2), the following scenarios need to be considered:
1. If the parser is at a comma, there's a missing element; otherwise the
comma would've been consumed by the first `eat` call above. Also, the
parser doesn't take the re-lexing route on a comma token.
2. If it's the first element and the current token is not a comma, it's
an invalid element.
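
As a (hypothetical) illustration, an input that exercises the "expected comma" recovery described above:

```py
# Missing comma between the two arguments; the parser should now report
# "expected comma" at the right location instead of panicking:
call(a b)
```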

resolves: #11640 

## Test Plan

- [x] Update existing test snapshots and validate them
- [x] Add additional test cases specific to the re-lexing logic and
validate the snapshots
- [x] Run the fuzzer on 3000+ valid inputs
- [x] Run the fuzzer on invalid inputs
- [x] Run the parser on various open source projects
- [x] Make sure the ecosystem changes are none
2024-06-17 06:47:00 +00:00
Micha Reiser
1f654ee729 Upgrade to Rust 1.79 (#11875) 2024-06-17 07:15:10 +01:00
Dhruv Manilawala
355d26f05c Use correct comment character for bash script in CI (#11896)
This should fix
https://github.com/astral-sh/ruff/actions/runs/9542715937/job/26298008128
2024-06-17 06:09:20 +00:00
renovate[bot]
027ea899ce Update NPM Development dependencies (#11893)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-06-17 07:58:37 +02:00
renovate[bot]
61c568268a Update dawidd6/action-download-artifact action to v6 (#11894)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-06-17 07:55:47 +02:00
renovate[bot]
01754a1209 Update pre-commit dependencies (#11892)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-06-17 07:55:10 +02:00
renovate[bot]
e684f6b1e0 Update Rust crate url to v2.5.1 (#11891)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-06-17 07:54:52 +02:00
renovate[bot]
142eda7dc6 Update Rust crate memchr to v2.7.4 (#11890)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-06-17 07:54:01 +02:00
renovate[bot]
d9c0590169 Update Rust crate clap to v4.5.7 (#11889)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-06-17 07:52:34 +02:00
Dhruv Manilawala
f8f0053a6c Trim trailing whitespace in server debug message (#11895) 2024-06-17 05:46:08 +00:00
github-actions[bot]
e7c4d28c5e Sync vendored typeshed stubs (#11885) 2024-06-15 02:15:19 +01:00
Zanie Blue
19cd9d7d8a Use https by default in schema store update script (#11882) 2024-06-14 17:30:54 -05:00
Zanie Blue
c50577f1d7 Fix prettier formatting in schema store update script (#11881)
How was this working for anyone else? The `prettier` path did not exist
on my machine. Also added `--force` to the push because otherwise you
can't re-run the script for a given Ruff commit.
2024-06-14 16:56:32 -05:00
Zanie Blue
2d6d85e993 Guard against malicious ecosystem comment artifacts (#11879) 2024-06-14 12:11:25 -05:00
Dhruv Manilawala
4f49e918a9 Bump version to v0.4.9 (#11872) 2024-06-14 20:36:22 +05:30
Dhruv Manilawala
d681a45b08 Make ruff_db a required crate for ruff_python_semantic (#11874)
## Summary

This PR makes the `ruff_db` a required crate for `ruff_python_semantic`.

Refer to
https://github.com/astral-sh/ruff/actions/runs/9516626143/job/26233307158?pr=11872

## Test Plan

1. `maturin sdist --out dist`
2. `tar -xf dist/ruff-0.4.8.tar.gz --directory=dist/ruff-0.4.8`
3. `pip install dist/ruff-0.4.8.tar.gz` works
2024-06-14 14:43:04 +01:00
Yair Peretz
89bb07c251 UPDATE latest supported versions to 3.13 (#11870) 2024-06-14 12:35:33 +01:00
Filip Czaplicki
fe462b30e7 Update Python compatibility to 3.13 (#11861)
## Summary

Update Python compatibility to 3.13 in README

## Test Plan
N/A
2024-06-13 19:53:03 -04:00
Micha Reiser
c5bc368e43 [red-knot] Improve Vfs and FileSystem documentation (#11856) 2024-06-13 11:49:27 +00:00
Micha Reiser
73370fe798 Use starts_with('/') instead of is_absolute to avoid platform specific API (#11855) 2024-06-13 12:35:31 +01:00
Micha Reiser
22b6488550 red-knot: Add directory support to MemoryFileSystem (#11825) 2024-06-13 07:48:28 +00:00
Micha Reiser
d4dd96d1f4 red-knot: source_text, line_index, and parsed_module queries (#11822) 2024-06-13 07:37:02 +00:00
Micha Reiser
efbf7b14b5 red-knot[salsa part 2]: Setup semantic DB and Jar (#11837)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-06-13 08:00:51 +01:00
Dhruv Manilawala
9dc226be97 Add supported commands in server capabilities (#11850)
## Summary

This PR updates the server capabilities to include the commands that
Ruff supports. This is similar to how there's a list of possible code
actions supported by the server.

I noticed this when I was trying to find whether Helix supported
workspace commands or not based on Jane's comment
(https://github.com/astral-sh/ruff/pull/11831#discussion_r1634984921)
and I found the `:lsp-workspace-command` in the editor, but it didn't
show anything in the picker.

So, I looked at the implementation in Helix
(9c479e6d2d/helix-term/src/commands/typed.rs (L1372-L1384))
which made me realize that Ruff doesn't provide this in its
capabilities. Currently, this does require `ruff` to be first in the
list of language servers in the user config but that should be resolved
by https://github.com/helix-editor/helix/pull/10176. So, the following
config should work:

```toml
[[language]]
name = "python"
# Ruff should come first until https://github.com/helix-editor/helix/pull/10176 is released
language-servers = ["ruff", "pyright"]
```

## Test Plan

1. Neovim's server capabilities output should include the supported
commands:

```
  executeCommandProvider = {                                                                                                                          
    commands = { "ruff.applyFormat", "ruff.applyAutofix", "ruff.applyOrganizeImports", "ruff.printDebugInformation" },                                
    workDoneProgress = false                                                                                                                          
  },
```

2. Helix should now display the commands to pick from when
`:lsp-workspace-command` is invoked:

<img width="832" alt="Screenshot 2024-06-13 at 08 47 14"
src="https://github.com/astral-sh/ruff/assets/67177269/09048ecd-c974-4e09-ab56-9482ff3d780b">
2024-06-13 09:32:43 +05:30
Alex Waygood
bcbddac21c Fix Display implementation for typeshed VERSIONS parser (#11848) 2024-06-12 19:56:52 +00:00
Alex Waygood
4ed3aed8d3 [red-knot] Add a parser for typeshed's VERSIONS file (#11836) 2024-06-12 11:44:45 +00:00
Dhruv Manilawala
60ea72a6bc Add list terminator kind for error recovery (#11843)
## Summary

This PR adds a new enum to determine the kind of terminator token, i.e.,
whether it actually terminates the list or is only used for error recovery.

This is important because the parser should take the error-recovery
route when the terminator token is there for error recovery, in which
case it will try to re-lex the token.

I haven't updated any references to use this new enum, as that would
update the snapshots. I plan to do that in a follow-up PR so that it's
easier to reason about.

## Test plan

`cargo insta test`
2024-06-12 08:33:26 +00:00
Dhruv Manilawala
a525b4be3d Separate terminator token for f-string elements kind (#11842)
## Summary

This PR separates the terminator token for f-string elements depending
on the context. A list of f-string elements can occur either in a regular
f-string or in the format spec of an f-string. The terminator token
differs depending on that context.

## Test Plan

`cargo insta test` and verify the updated snapshots.
2024-06-12 13:57:35 +05:30
Micha Reiser
93973b96cb red-knot: VfsFile input ingredient and a Vfs (#11802) 2024-06-12 07:06:15 +00:00
Dhruv Manilawala
db8f2c2d9f Use the existing ruff_python_trivia::is_python_whitespace function (#11844)
## Summary

This PR re-uses `ruff_python_trivia::is_python_whitespace` in the
lexer instead of defining its own. The duplicate mainly existed to avoid
a circular dependency, which was resolved in #11261.
2024-06-12 05:59:19 +00:00
Carl Meyer
5c0df7a150 [red-knot] add type narrowing (#11790)
## Summary

Add `Constraint` nodes to the flow graph, and narrow types based on them
(only `is None` and `is not None` narrowing is supported for now, to
prototype the structure).
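
A sketch of the supported narrowing (hypothetical example; the types in the comments are illustrative):

```py
def f(flag: bool) -> None:
    x = None if flag else 1  # `x` is either `None` or `Literal[1]`
    if x is not None:
        y = x  # narrowed: `x` can only be `Literal[1]` here
```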

Also add simplification of zero- and one-element unions and
intersections, and flattening of intersections.

There's a lot more normalization logic needed for unions and
intersections (as is obvious from the inferred type in the added
`narrow_none` test), but this will be non-trivial and I'd rather do it
in a separate PR.

Here's a flowchart diagram for the code in the added `narrow_none` test:

![Screenshot 2024-06-07 at 2 58 00 PM](https://github.com/astral-sh/ruff/assets/61586/5152a400-739c-41ff-8bbf-3c19d16bd083)

The top branch is for the `if` expression in the initial assignment to
`x`; that `Constraint` node would only affect the type of `flag`, which
we don't care about in this test.

The second branch is for the `if` statement, with `Constraint` node
affecting the type of `x`.

## Test Plan

Added tests.
2024-06-12 04:38:50 +00:00
Jane Lewis
7d5cf1811b ruff server: Improve error message when a command is run on an unavailable document (#11823)
## Summary

Fixes #11744.

We now show a distinct popup message when we fail to get a document
snapshot during command execution. This message more clearly
communicates the issue to the user, instead of a generic "ruff
encountered an error" message.

## Test Plan

Try running `Fix all auto-fixable problems` on an incompatible file (for
example: `settings.json`). You should see the following popup message:
<img width="456" alt="Screenshot 2024-06-11 at 11 47 16 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/3a28e3d7-3896-4dd0-b117-f87300dd3b68">
2024-06-11 18:50:01 +00:00
Jane Lewis
4e9d771aa0 ruff server: Introduce the ruff.printDebugInformation command (#11831)
## Summary

Closes #11715.

Introduces a new command, `ruff.printDebugInformation`. This will print
useful information about the status of the server to `stderr`.

Right now, the information shown by this command includes:
* The path to the server executable
* The version of the executable
* The text encoding being used
* The number of open documents and workspaces
* A list of registered configuration files
* The capabilities of the client

## Test Plan

First, checkout and use [the corresponding `ruff-vscode`
PR](https://github.com/astral-sh/ruff-vscode/pull/495).

Running the `Print debug information` command in VS Code should show
something like the following in the Output channel:

<img width="991" alt="Screenshot 2024-06-11 at 11 41 46 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/ab93c009-bb7b-4291-b057-d44fdc6f9f86">
2024-06-11 11:42:46 -07:00
Jane Lewis
507f5c1137 ruff server: Tracing system now respects log level and trace level, with options to log to a file (#11747)
## Summary

Fixes #10968.
Fixes #11545.

The server's tracing system has been rewritten from the ground up. The
server now has trace level and log level settings which restrict the
tracing events and spans that get logged.

* A `logLevel` setting has been added, which lets a user set the log
level. By default, it is set to `"info"`.
* A `logFile` setting has also been added, which lets the user supply an
optional file to send tracing output (it does not have to exist as a
file yet). By default, if this is unset, tracing output will be sent to
`stderr`.
* A `$/setTrace` handler has also been added, and we also set the trace
level from the initialization options. For editors without direct
support for tracing, the environment variable `RUFF_TRACE` can override
the trace level.
* Small changes have been made to how we display tracing output. We no
longer use `tracing-tree`, and instead use
`tracing_subscriber::fmt::Layer` to format output. Thread names are now
included in traces, and I've made some adjustments to thread worker names
to make them more useful.

## Test Plan

In VS Code, with `ruff.trace.server` set to its default value, no logs
from Ruff should appear.

After changing `ruff.trace.server` to either `messages` or `verbose`,
you should see log messages at `info` level or higher appear in Ruff's
output:
<img width="1005" alt="Screenshot 2024-06-10 at 10 35 04 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/6050d107-9815-4bd2-96d0-e86f096a57f5">

In Helix, by default, no logs from Ruff should appear.

To set the trace level in Helix, you'll need to modify your language
configuration as follows:
```toml
[language-server.ruff]
command = "/Users/jane/astral/ruff/target/debug/ruff"
args = ["server", "--preview"]
environment = { "RUFF_TRACE" = "messages" }
```

After doing this, logs of `info` level or higher should be visible in
Helix:
<img width="1216" alt="Screenshot 2024-06-10 at 10 39 26 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/8ff88692-d3f7-4fd1-941e-86fb338fcdcc">

You can use `:log-open` to quickly open the Helix log file.

In Neovim, by default, no logs from Ruff should appear.

To set the trace level in Neovim, you'll need to modify your
configuration as follows:
```lua
require('lspconfig').ruff.setup {
  cmd = {"/path/to/debug/executable", "server", "--preview"},
  cmd_env = { RUFF_TRACE = "messages" }
}
```

You should see logs appear in `:LspLog` that look like the following:
<img width="1490" alt="Screenshot 2024-06-11 at 11 24 01 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/576cd5fa-03cf-477a-b879-b29a9a1200ff">

You can adjust `logLevel` and `logFile` in `settings`:
```lua
require('lspconfig').ruff.setup {
  cmd = {"/path/to/debug/executable", "server", "--preview"},
  cmd_env = { RUFF_TRACE = "messages" },
  settings = {
    logLevel = "debug",
    logFile = "your/log/file/path/log.txt"
  }
}
```

The `logLevel` and `logFile` can also be set in Helix like so:
```toml
[language-server.ruff.config.settings]
logLevel = "debug"
logFile = "your/log/file/path/log.txt"
```

Even if this log file does not exist, it should now be created and
written to after running the server:

<img width="1148" alt="Screenshot 2024-06-10 at 10 43 44 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/ab533cf7-d5ac-4178-97f1-e56da17450dd">
2024-06-11 11:29:47 -07:00
Lukas Masuch
4e92102922 Add Streamlit into the Who's Using Ruff? section (#11838)
## Summary

We recently updated our old formatting and linting setup in Streamlit to
use ruff: https://github.com/streamlit/streamlit/pull/8849

Thanks for this excellent tool :) 

## Test Plan

This PR only contains changes to the Readme.
2024-06-11 15:14:23 +00:00
Charlie Marsh
08b548626a Avoid suggesting starmap when arguments are used outside call (#11830)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11810.
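
A sketch of the false positive being fixed (hypothetical example; `f` and `pairs` are made-up names):

```py
def f(a: int, b: int) -> int:
    return a + b


pairs = [(1, 2), (3, 4)]

# Rewriting as `list(itertools.starmap(f, pairs))` is a fine suggestion here:
results = [f(a, b) for a, b in pairs]

# ...but not here, where `a` is also used outside the `f(...)` call,
# so the rule should no longer fire:
annotated = [(a, f(a, b)) for a, b in pairs]
```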
2024-06-10 17:10:06 -04:00
Charlie Marsh
0d06900cec Fix isort FAQ to surface correct src setting (#11829)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11722. Based on feedback
in that issue.
2024-06-10 16:33:13 -04:00
Micha Reiser
521a358a4d Set 1.75 as minimal Rust version (#11821)
## Summary

Salsa requires https://github.com/rust-lang/rust/issues/91611, which
stabilized in
[1.75](https://blog.rust-lang.org/2023/12/28/Rust-1.75.0.html).

1.75 doesn't change the supported platforms, so I think this upgrade is
fine in a minor release.

## Test Plan

`cargo build`
2024-06-10 08:39:22 -04:00
renovate[bot]
134aa7c7d5 Update dawidd6/action-download-artifact action to v5 (#11818) 2024-06-09 22:23:19 -04:00
renovate[bot]
33e44c25ad Update NPM Development dependencies (#11817) 2024-06-09 21:54:15 -04:00
renovate[bot]
48163dcaca Update dependency uuid to v10 (#11819) 2024-06-10 01:49:58 +00:00
renovate[bot]
e78b9dc7fe Update Rust crate strum_macros to v0.26.4 (#11814)
2024-06-09 21:47:56 -04:00
renovate[bot]
141d1a8cdf Update pre-commit dependencies (#11816) 2024-06-10 01:47:48 +00:00
renovate[bot]
8b9bbc0c84 Update Rust crate toml to v0.8.14 (#11815) 2024-06-10 01:47:45 +00:00
renovate[bot]
87ea06e360 Update Rust crate regex to v1.10.5 (#11813) 2024-06-10 01:47:23 +00:00
renovate[bot]
74246f4acc Update Rust crate clap to v4.5.6 (#11812) 2024-06-10 01:47:06 +00:00
Gilles Peiffer
b3b2f57d8e [pylint] Fix flag name in too-many-public-methods (PLR0904) (#11809) 2024-06-09 19:44:12 -04:00
Dhruv Manilawala
549cc1e437 Build CommentRanges outside the parser (#11792)
## Summary

This PR updates the parser to no longer build the `CommentRanges`;
instead, it's built by the linter and the formatter when required.

For the linter, it'll be built and owned by the `Indexer` while for the
formatter it'll be built from the `Tokens` struct and passed as an
argument.

## Test Plan

`cargo insta test`
2024-06-09 09:55:17 +00:00
Philipp Thiel
7509a48eab Adapted fix to work identical to format (#10999)
## Summary

The fix for E203 now produces the same result as `ruff format` in cases
where a slice ends on a colon and the closing square bracket is on the
following line.

Refers to https://github.com/astral-sh/ruff/issues/10973
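
A hedged sketch of the shape of code in question (hypothetical; the actual reproduction is in the linked issue):

```py
ham = list(range(10))

# The slice ends on a colon and the closing bracket sits on the next line;
# the E203 fix now produces the same layout as `ruff format`:
piece = ham[1 :
]
```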

## Test Plan

The minimal reproduction case in the ticket was added as a test case
producing no error. Additional cases with multiple spaces or a tab
before the colon were added to make sure that the rule still finds
these.
2024-06-08 19:29:18 -04:00
Alex Waygood
af821ecda1 Fix TypeVarTuple typo in pyupgrade rule (#11806) 2024-06-08 22:47:55 +00:00
Aleksei Latyshev
ccc418cc49 [refurb] Implement repeated-global (FURB154) (#11187)
Implement repeated_global (FURB154) lint.
See:
- https://github.com/astral-sh/ruff/issues/1348
- [original
lint](https://github.com/dosisod/refurb/blob/master/refurb/checks/builtin/simplify_global_and_nonlocal.py)
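
A minimal sketch of the pattern this lint flags, with its suggested rewrite (hypothetical example):

```py
def f():
    global x
    global y  # FURB154: consecutive `global` statements can be merged
    x = y = 1


def g():
    global x, y  # preferred: a single statement
    x = y = 1
```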

## Test Plan
cargo test
2024-06-08 20:35:40 +00:00
Charlie Marsh
b98ab1b0b6 Add isort standard-library distinction to FAQ (#11804)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11726.
2024-06-08 16:10:50 -04:00
aditya pillai
ed947792cf Handle non-printable characters in diff view (#11687)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-06-08 06:22:03 +00:00
Charlie Marsh
ee1621b2f9 Use real file path when available in ruff server (#11800)
## Summary

As-is, we're using the URL path for all files, leading us to use paths
like:

```
/c%3A/Users/crmar/workspace/fastapi/tests/main.py
```

This doesn't match against per-file ignores and other patterns in Ruff
configuration.

This PR modifies the LSP to use the real file path if available, and the
virtual file path if not.

Closes https://github.com/astral-sh/ruff/issues/11751.

## Test Plan

Ran the LSP on Windows. In the FastAPI repo, added:

```toml
[tool.ruff.lint.per-file-ignores]
"tests/**/*.py" = ["F401"]
```

And verified that an unused import was ignored in `tests` after this
change, but not before.
2024-06-07 22:48:53 -07:00
Micha Reiser
32ca704956 Rename PreorderVisitor to SourceOrderVisitor (#11798)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-06-07 17:01:58 +00:00
Alex Waygood
37d8de3316 [red-knot] Include vendored typeshed stubs as a zipfile in the Ruff binary (#11779)
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Carl Meyer <carl@astral.sh>
2024-06-07 15:00:36 +00:00
Carl Meyer
4157c8635b [red-knot] add None type (#11788)
Add type for None.
2024-06-07 08:40:22 -06:00
Dhruv Manilawala
d22f3402e1 Remove result_like dependency (#11793)
## Summary

This PR removes the `result-like` dependency and instead implements the
required functionality. The motivation is that `noqa.is_enabled()` is
easier to read than `noqa.into()`.

For context, I was just trying to understand the syntax error workflow
and I saw these flags which were being converted via `into`. I always
find `into` confusing because you never know what it's being converted
into unless you know the type. I later realized that it's just a boolean
flag. After removing the usages from these two flags, it turned out that
the dependency was only being used in one rule, so I thought I'd remove
that as well.

## Test Plan

`cargo insta test`
2024-06-07 11:53:22 +05:30
Embers-of-the-Fire
ea27445479 [refurb] Fix misbehavior of operator.itemgetter when getter param is a tuple (#11774) 2024-06-07 03:10:52 +00:00
Carl Meyer
540d76892f [red-knot] remove duplicate test from bad merge (#11787)
Somehow a merge of a PR that had all-green CI duplicated this test when
it merged into main, breaking the build.
2024-06-06 22:40:19 +00:00
Carl Meyer
cd101c83ae [red-knot] condense int literals (#11784)
Display `(Literal[1] | Literal[2])` as `Literal[1, 2]`, and `(Literal[1]
| Literal[2] | OtherType)` as `(Literal[1, 2] | OtherType)`.

Fixes #11782

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-06-06 16:30:40 -06:00
Carl Meyer
b2fc0df6db [red-knot] flatten unions (#11783)
Flatten union types. Fixes #11781
2024-06-06 16:13:40 -06:00
Alex Waygood
93eefb1417 [red-knot] Cleanup module-resolution logic in module.rs (#11777) 2024-06-06 17:33:02 +01:00
Alex Waygood
303ef02f93 [red-knot] Encapsulate module resolution logic in module.rs (#11767) 2024-06-06 14:31:09 +00:00
Dhruv Manilawala
1b7d08c2c9 Consider : to terminate parenthesized with items (#11775)
## Summary

This PR is a follow-up to this discussion
(https://github.com/astral-sh/ruff/pull/11770#discussion_r1628917209)
which adds the `:` token in the terminator set for parenthesized with
items.

The main motivation is to avoid parsing too much in speculative mode.
This is evident from the following _before_ and _after_ parsed
with-items lists for the following code:

```py
with (item1, item2:
    foo
```

<table>
  <tr>
    <th>Before (3 items)</th>
    <th>After (2 items)</th>
  </tr>
  <tr>
    <td>
<pre>
parsed_with_items: [
    ParsedWithItem {
        item: WithItem {
            range: 6..11,
            context_expr: Name(
                ExprName {
                    range: 6..11,
                    id: "item1",
                    ctx: Load,
                },
            ),
            optional_vars: None,
        },
        is_parenthesized: false,
    },
    ParsedWithItem {
        item: WithItem {
            range: 13..18,
            context_expr: Name(
                ExprName {
                    range: 13..18,
                    id: "item2",
                    ctx: Load,
                },
            ),
            optional_vars: None,
        },
        is_parenthesized: false,
    },
    ParsedWithItem {
        item: WithItem {
            range: 24..27,
            context_expr: Name(
                ExprName {
                    range: 24..27,
                    id: "foo",
                    ctx: Load,
                },
            ),
            optional_vars: None,
        },
        is_parenthesized: false,
    },
]
</pre>
	</td>
    <td>
<pre>
parsed_with_items: [
    ParsedWithItem {
        item: WithItem {
            range: 6..11,
            context_expr: Name(
                ExprName {
                    range: 6..11,
                    id: "item1",
                    ctx: Load,
                },
            ),
            optional_vars: None,
        },
        is_parenthesized: false,
    },
    ParsedWithItem {
        item: WithItem {
            range: 13..18,
            context_expr: Name(
                ExprName {
                    range: 13..18,
                    id: "item2",
                    ctx: Load,
                },
            ),
            optional_vars: None,
        },
        is_parenthesized: false,
    },
]
</pre>
	</td>
  </tr>
</table>

## Test Plan

`cargo insta test`
2024-06-06 18:40:44 +05:30
Carl Meyer
fcaa62f0d9 [red-knot] support if-expressions in type inference and CFG (#11765) 2024-06-06 04:40:44 -06:00
Embers-of-the-Fire
f144edeefa [Bug fix] Fix rule B909's panic when checking large loop blocks (#11772) 2024-06-06 12:23:28 +02:00
Dhruv Manilawala
6c1fa1d440 Use speculative parsing for with-items (#11770)
## Summary

This PR updates the with-items parsing logic to use speculative parsing
instead.

### Existing logic

First, let's understand the previous logic:
1. The parser sees `(` but doesn't know whether it's part of
parenthesized with items or a parenthesized expression
2. Consider it a parenthesized with items and perform a hand-rolled
speculative parsing
3. Then, verify the assumption and if it's incorrect convert the parsed
with items into an appropriate expression which becomes part of the
first with item

Here, in (3), there are lots of edge cases which we have to deal with:
1. Trailing comma with a single element should be [converted to the
expression as
is](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2140-L2153))
2. Trailing comma with multiple elements should be [converted to a tuple
expression](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2155-L2178))
3. Limit the allowed expression based on whether it's
[(1)](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2144-L2152))
or
[(2)](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2157-L2171))
4. [Consider postfix
expressions](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2181-L2200))
after (3)
5. [Consider `if`
expressions](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2203-L2208))
after (3)
6. [Consider binary
expressions](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2210-L2228))
after (3)

Consider other cases like
* [Single generator
expression](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2020-L2035))
* [Expecting a
comma](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2122-L2130))

And, this is all possible only if we allow parsing these expressions in
the [with item parsing
logic](9b2cf569b2/crates/ruff_python_parser/src/parser/statement.rs (L2287-L2334)).

### Speculative parsing

With #11457 merged, we can simplify this logic by changing the step (3)
from above to just rewind the parser back to the `(` if our assumption
(parenthesized with-items) was incorrect and then continue parsing it
considering parenthesized expression.

This also behaves a lot similar to what a PEG parser does which is to
consider the first grammar rule and if it fails consider the second
grammar rule and so on.
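
To illustrate (a parse-level sketch with placeholder names), here is the kind of ambiguity the rewind resolves:

```py
# Parenthesized with-items: the parentheses delimit two context managers.
with (open("a"), open("b")):
    ...

# Parenthesized expression: after `(x, y)` the parser sees `==`, so the
# with-items assumption fails and it rewinds to re-parse an expression.
with (x, y) == (1, 2):
    ...
```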

resolves: #11639 

## Test Plan

- [x] Verify the updated snapshots
- [x] Run the fuzzer on around 3000 valid source code (locally)
2024-06-06 08:59:56 +00:00
Max Muoto
5a5a588a72 [pylint] Implement dict-iter-missing-items (C0206) (#11688)
## Summary

This PR implements the [consider dict
items](https://pylint.pycqa.org/en/latest/user_guide/messages/convention/consider-using-dict-items.html)
rule from Pylint. Enabling this rule flags:

```python
ORCHESTRA = {
    "violin": "strings",
    "oboe": "woodwind",
    "tuba": "brass",
    "gong": "percussion",
}


for instrument in ORCHESTRA: 
    print(f"{instrument}: {ORCHESTRA[instrument]}")

for instrument in ORCHESTRA.keys(): 
    print(f"{instrument}: {ORCHESTRA[instrument]}")

for instrument in (inline_dict := {"foo": "bar"}): 
    print(f"{instrument}: {inline_dict[instrument]}")
```

Each of these loops is flagged for not using `items()` to extract the
value out of the dict. We ignore the case of an assignment, since you
can't modify the underlying dict through the values in the list of
tuples returned.
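
The idiomatic rewrite the rule points toward (a sketch; `section` is a placeholder name) iterates over the key/value pairs directly:

```python
for instrument, section in ORCHESTRA.items():
    print(f"{instrument}: {section}")
```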
 

## Test Plan

`cargo test`.

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-06-06 00:28:01 -04:00
Carl Meyer
084e5464fb [red-knot] support walrus expressions in type inference (#11762)
## Summary

Add support for walrus expressions, both in expression type inference
and in symbol definition type inference.

## Test Plan

Added test.
2024-06-05 15:13:10 -06:00
Carl Meyer
31f97329c0 [red-knot] refactor Definitions out of symbol table (#11761)
## Summary

Definitions are used in symbol table and in flow graph, and aren't
inherently owned by one or the other; move them into their own
submodule.

## Test Plan

Existing tests.
2024-06-05 14:35:01 -06:00
Carl Meyer
b46e9e825a [red-knot] arithmetic on int literals (#11760)
## Summary

Add support for inferring int literal types from basic arithmetic on int
literals. Just to begin showing examples of resolving more complex
expression types, and because this will be useful in testing walrus
expressions.
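
For example (a sketch of the inference added):

```py
x = 2 + 3  # `x` can now be inferred as an int-literal type (5)
```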

## Test Plan

Added test.
2024-06-05 14:10:37 -06:00
Carl Meyer
9b2cf569b2 [red-knot] rename Definition::None to Definition::Unbound (#11758)
## Summary

After looking at this a bit, I think it does make sense to have
`Unbound` as part of the `Definition` enum; if we are modeling `Unbound`
as a type (which currently we are), then every symbol implicitly starts
each scope with a "definition" as unbound, and the cleanest way to model
that is as a real `Definition`. We should be able to handle a definition
of "unbound" anywhere we handle definitions.

But the name `None` wasn't clear enough; changing the name to `Unbound`
and adding a doc comment.

Also change `[first].into_iter()` to `std::iter::once(first)`, from
post-land code review on a prior PR.

## Test Plan

Existing tests.
2024-06-05 11:32:26 -06:00
Micha Reiser
5806bc915d Fix formatter instability for lines only consisting of zero-width characters (#11748) 2024-06-05 17:55:14 +02:00
Micha Reiser
b0b4706e2d Red-knot: Track scopes per expression (#11754) 2024-06-05 17:53:26 +02:00
Dhruv Manilawala
a8cf7096ff Bump version to v0.4.8 (#11755)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-06-05 20:51:31 +05:30
Carl Meyer
895eb3ef48 [red-knot] refactor CFG outside of symbol table (#11746) 2024-06-05 06:23:43 -06:00
Dhruv Manilawala
2e0a9755e0 Disallow access to Parsed output, use the API instead (#11741)
## Summary

This PR is a follow-up to #11740 to restrict access to the `Parsed`
output by replacing the `parsed` API function with a more specific one.
Currently, that is `comment_ranges` but the linked PR exposes a `tokens`
method.

The main motivation is so that there's no way to get an incorrect
information from the checker. And, it also encapsulates the source of
the comment ranges and the tokens itself. This way it would become
easier to just update the checker if the source for these information
changes in the future.

## Test Plan

`cargo insta test`
2024-06-05 08:24:19 +00:00
Dhruv Manilawala
b021b5babe Use Tokens from parsed type annotation or parsed source (#11740)
## Summary

This PR fixes a bug where the checker would require the tokens for an
invalid offset w.r.t. the source code.

Taking the source code from the linked issue as an example:
```py
relese_version :"0.0is 64"
```

Now, this isn't really a valid type annotation but that's what this PR
is fixing. Regardless of whether it's valid or not, Ruff shouldn't
panic.

The checker would visit the parsed type annotation (`0.0is 64`) and try
to detect any violations. Certain rule logic requests the tokens for the
annotation, but this would fail because lexing the original source code
only yields a `String` token at that location. This worked before because
the lexer was invoked anew for each rule.

The solution is to store the parsed type annotation on the checker if
it's in a typing context and use the tokens from that instead if it's
available. This is enforced by creating a new API on the checker to get
the tokens.

But, this means that there are two ways to get the tokens via the
checker API. I want to restrict this in a follow-up PR (#11741) to only
expose `tokens` and `comment_ranges` as methods and restrict access to
the parsed source code.

fixes: #11736 

## Test Plan

- [x] Add a test case for `F632` rule and update the snapshot
- [x] Check all affected rules
- [x] No ecosystem changes
2024-06-05 07:50:33 +00:00
Dhruv Manilawala
eed6d784df Update type annotation parsing API to return Parsed (#11739)
## Summary

This PR updates the return type of `parse_type_annotation` from `Expr`
to `Parsed<ModExpression>`. This is to allow accessing the tokens for
the parsed sub-expression in the follow-up PR.

## Test Plan

`cargo insta test`
2024-06-05 12:59:43 +05:30
Jane Lewis
8338db6c12 ruff server: Formatting a document with syntax problems no longer spams a visible error popup (#11745)
## Summary

Fixes https://github.com/astral-sh/ruff-vscode/issues/482.

I've made adjustments to `format` and `format_range` that handle parsing
errors before they become server errors. We'll still log this as a
problem, but there will no longer be a visible popup.

## Test Plan

Instead of seeing a visible error when formatting a document with syntax
issues, you should see this warning in the LSP logs:

<img width="991" alt="Screenshot 2024-06-04 at 3 38 23 PM"
src="https://github.com/astral-sh/ruff/assets/19577865/9d68947d-6462-4ca6-ab5a-65e573c91db6">

Similarly, if you try to format a range with syntax issues, you should
see this warning in the LSP logs instead of a visible error popup:

<img width="1010" alt="Screenshot 2024-06-04 at 3 39 10 PM"
src="https://github.com/astral-sh/ruff/assets/19577865/99fff098-798d-406a-976e-81ead0da0352">

---------

Co-authored-by: Zanie Blue <contact@zanie.dev>
2024-06-04 17:18:21 -07:00
Carl Meyer
d056d09547 [red-knot] add if-statement support to FlowGraph (#11673)
## Summary

Add if-statement support to FlowGraph. This introduces branches and
joins in the graph for the first time.
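
A sketch of the shape this handles (hypothetical snippet):

```py
def f(flag):
    x = 1
    if flag:
        x = 2
    # walking the flow graph backward from this use of `x` crosses the
    # join: both `x = 1` and `x = 2` are reachable definitions
    return x
```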

## Test Plan

Added tests.
2024-06-04 15:09:39 -06:00
Mateusz Sokół
1645be018d Update NPY001 rule for NumPy 2.0 (#11735)
Hi!

This PR addresses https://github.com/astral-sh/ruff/issues/11093.

It skips `np.bool` and `np.long` replacements as both of these names
were reintroduced in NumPy 2.0 with a different meaning
(https://github.com/numpy/numpy/pull/24922,
https://github.com/numpy/numpy/pull/25080).
With this change `NPY001` will no longer conflict with `NPY201`. For
projects using NumPy 1.x `np.bool` and `np.long` has been deprecated and
removed long time ago, and accessing them yields an informative error
message.
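
A sketch of the resulting behavior:

```py
import numpy as np

np.bool  # no longer flagged by NPY001: reintroduced (with a new meaning) in NumPy 2.0
np.int   # still flagged: removed without being reintroduced
```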
2024-06-04 19:23:42 +00:00
Michael Oultram
2c865023ac CI: add job to run tests under minimum supported rust version (msrv) (#11737)
## Summary

This change adds a GitHub Actions CI job to check that the project
builds and test pass under the declared minimum supported rust compiler.
I have bumped the msrv to 1.74 as that is the lowest version I could get
this project to build on.

## Test Plan

The CI job has run on this PR, and will also run on the main branch.
2024-06-04 15:14:50 -04:00
Dhruv Manilawala
2567e14b7a Lexer should consider BOM for the start offset (#11732)
## Summary

This PR fixes a bug where the lexer didn't consider the BOM into the
start offset.

fixes: #11731

## Test Plan

Add multiple test cases which involve a BOM character in the source for
the lexer and verify the snapshots.
2024-06-04 08:45:46 +00:00
Dhruv Manilawala
3b19df04d7 Use cursor offset for lexer checkpoint (#11734)
## Summary

This PR updates the lexer checkpoint to store the cursor offset instead
of cloning the cursor itself. This reduces the size of `LexerCheckpoint`
from 136 to 112 bytes and also removes the need for lifetime.

## Test Plan

`cargo insta test`
2024-06-04 14:13:57 +05:30
Micha Reiser
6ffb96171a red-knot: Change resolve_global_symbol to take Module as an argument (#11723) 2024-06-04 06:20:50 +00:00
Micha Reiser
64165bee43 red-knot: Use parse_unchecked to get all parse errors (#11725) 2024-06-04 06:04:48 +00:00
Charlie Marsh
0c75548146 Respect per-file ignores for blanket and redirected noqa rules (#11728)
## Summary

Ensures that we respect per-file ignores and exemptions for these rules.
Specifically, we allow:

```python
# ruff: noqa: PGH004
```

...to ignore `PGH004`.
2024-06-04 03:57:59 +00:00
Alex
b56a577f25 [pygrep_hooks] Check blanket ignores via file-level pragmas (PGH004) (#11540)
## Summary

Should resolve https://github.com/astral-sh/ruff/issues/11454.

This is my first PR to `ruff`, so I may have missed something.

If I understood the suggestion in the issue correctly, rule `PGH004`
should be set to `Preview` again.

## Test Plan

Created two fixtures derived from the issue.
2024-06-04 03:42:58 +00:00
Tushar Sadhwani
e1133a24ed [flake8-pyi] Implement PYI063 (#11699)
## Summary
Implements `Y063` from `flake8-pyi`.

## Test Plan
`cargo test` / `cargo insta review`
2024-06-04 03:15:04 +00:00
Charlie Marsh
2f8ac1e9b3 Fix red-knot compilation (#11727)
## Summary

Perhaps a result of a bad rebase, but `cargo clippy --fix --workspace
--all-targets -- -D warnings` does not pass on main as-is.
2024-06-04 03:03:38 +00:00
Carl Meyer
3fb2028506 [red-knot] extract helper functions in inference tests (#11671)
There's a lot of repeated boilerplate in the type inference tests; this
cuts it down a lot.
2024-06-03 17:46:04 -06:00
Carl Meyer
3f9ee31efb [red-knot] use reachable definitions in infer_expression_type (#11670)
## Summary

Switch name resolution in `infer_expression_type` from resolving the
public type of a symbol, to resolving the reachable definitions of that
symbol from the reference point, using the flow graph.

This surfaced a bug in the flow graph implementation and a bug in symbol
table building, both of which are also fixed here.

The bug in flow graph implementation was that when we pushed and popped
scopes, we didn't maintain a stack of "current flow nodes" in all
stacked scopes, to be restored when we returned to that scope. Now we
do.

The bug in symbol table building was that we didn't visit the parts of
function and class definitions in the correct scopes. E.g. decorators
should be visited in the outer scope, arguments should be visited inside
the type-params scope (if any) but not inside the function body scope,
and only the body itself should actually be visited inside the body
scope. Fixing this requires that we no longer use `walk_stmt` here,
instead we have to visit each individual component.
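
A sketch of the scoping described above (`decorator` is a placeholder):

```py
@decorator        # visited in the enclosing (outer) scope
def f[T](x: T):   # `T` and the parameters are visited in the type-params scope
    return x      # only the body is visited in the function-body scope
```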

## Test Plan

Added test.
2024-06-03 17:45:31 -06:00
Carl Meyer
b02d3f3fd9 [red-knot] infer_symbol_public_type infers union of all definitions (#11669)
## Summary

Rename `infer_symbol_type` to `infer_symbol_public_type`, and allow it
to work on symbols with more than one definition. For now, use the most
cautious/sound inference, which is the union of all definitions. We can
prune this union more in future by eliminating definitions if we can
show that they can't be visible (this requires both that the symbol is
definitely later reassigned, and that there is no intervening
call/import that might be able to see the over-written definition).
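
For example (a sketch):

```py
x = 1
x = "one"
# the public type of `x` is now the cautious union of both definitions,
# rather than just the type of the last assignment
```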

## Test Plan

Added a test showing inference of union from multiple definitions.
2024-06-03 17:27:06 -06:00
Dhruv Manilawala
2b28889ca9 Isolate non-breaking whitespace indentation test case (#11721)
As discussed in Discord, this moves the test case for non-breaking
whitespace into its own method.
2024-06-03 13:20:55 +00:00
Dhruv Manilawala
8db147c09d Generator should add a newline before type statement (#11720)
## Summary

This PR fixes a bug where the `Generator` wouldn't add a newline before
a type alias statement. This is because it wasn't using the `statement`
macro which takes care of the newline.

Without this fix, code like:
```py
type X = int
type Y = str
```

The generator would produce:
```py
type X = inttype Y = str
```

## Test Plan

Add a test case.
2024-06-03 18:44:21 +05:30
Dhruv Manilawala
a58bde6958 Remove less used parser dependencies (#11718)
## Summary

This PR removes the following dependencies from the `ruff_python_parser`
crate:
* `anyhow` (moved to dev dependencies)
* `is-macro`
* `itertools`

The main motivation is that they aren't used much.

Additionally, it updates the return type of `parse_type_annotation` to
use a more specific `ParseError` instead of the generic `anyhow::Error`.

## Test Plan

`cargo insta test`
2024-06-03 13:08:24 +00:00
Dhruv Manilawala
f4e23d2dff Use string expression for parsing type annotation (#11717)
## Summary

This PR updates the logic for parsing type annotation to accept a
`ExprStringLiteral` node instead of the string value and the range.

The main motivation of this change is to simplify the implementation of
`parse_type_annotation` function with:
* Use the `opener_len` and `closer_len` from the string flags to get the
raw contents range instead of extracting it via
	* `str::leading_quote(expression).unwrap().text_len()`
	* `str::trailing_quote(expression).unwrap().text_len()`
* Avoid comparing the string content if we already know that it's
implicitly concatenated

## Test Plan

`cargo insta test`
2024-06-03 13:04:03 +00:00
Dhruv Manilawala
4a155e2b22 Re-order lexer methods (#11716)
## Summary

This PR re-orders the lexer methods in the following order:

1. `next_token`
2. `lex_token`
3. `eat_indentation`
4. `handle_indentation`
5. `skip_whitespace`
6. `consume_ascii_character`
7. `try_single_char_prefix`
8. `try_double_char_prefix`
9. `lex_identifier`
10. `lex_fstring_start`
11. `lex_fstring_middle_or_end`
12. `lex_string`
13. `lex_number`
14. `lex_number_radix`
15. `lex_decimal_number`
16. `radix_run`
17. `lex_comment`
18. `lex_ipython_escape_command`
19. `consume_end`

The following was considered for the ordering:
* 1 is the main entry point which delegates to 2
* 3, 4, 5 are all related to whitespace which is done first
* 6 is the entrypoint for an ascii character which delegates to 9, 12,
13, 17, 18, 19
* Others are grouped around similar kind of methods
2024-06-03 12:58:35 +00:00
Dhruv Manilawala
bf5b62edac Maintain synchronicity between the lexer and the parser (#11457)
## Summary

This PR updates the entire parser stack in multiple ways:

### Make the lexer lazy

* https://github.com/astral-sh/ruff/pull/11244
* https://github.com/astral-sh/ruff/pull/11473

Previously, Ruff's lexer would act as an iterator. The parser would
collect all the tokens in a vector first and then process the tokens to
create the syntax tree.

The first task in this project is to update the entire parsing flow to
make the lexer lazy. This includes the `Lexer`, `TokenSource`, and
`Parser`. For context, the `TokenSource` is a wrapper around the `Lexer`
to filter out the trivia tokens[^1]. Now, the parser will ask the token
source to get the next token and only then the lexer will continue and
emit the token. This means that the lexer needs to be aware of the
"current" token. When the `next_token` is called, the current token will
be updated with the newly lexed token.

The main motivation to make the lexer lazy is to allow re-lexing a token
in a different context. This is going to be really useful to make the
parser error resilience. For example, currently the emitted tokens
remains the same even if the parser can recover from an unclosed
parenthesis. This is important because the lexer emits a
`NonLogicalNewline` in parenthesized context while a normal `Newline` in
non-parenthesized context. This different kinds of newline is also used
to emit the indentation tokens which is important for the parser as it's
used to determine the start and end of a block.

Additionally, this allows us to implement the following functionalities:
1. Checkpoint - rewind infrastructure: The idea here is to create a
checkpoint and continue lexing. At a later point, this checkpoint can be
used to rewind the lexer back to the provided checkpoint.
2. Remove the `SoftKeywordTransformer` and instead use lookahead or
speculative parsing to determine whether a soft keyword is a keyword or
an identifier
3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted
by the lexer but it contains owned data which makes it expensive to
clone. The new `TokenKind` enum just represents the type of token which
is very cheap.

This brings up a question as to how the parser will get the owned value
that was stored on `Tok`. This will be solved by introducing a new
`TokenValue` enum which only contains a subset of token kinds which has
the owned value. This is stored on the lexer and is requested by the
parser when it wants to process the data. For example:
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)

[^1]: Trivia tokens are `NonLogicalNewline` and `Comment`

### Remove `SoftKeywordTransformer`

* https://github.com/astral-sh/ruff/pull/11441
* https://github.com/astral-sh/ruff/pull/11459
* https://github.com/astral-sh/ruff/pull/11442
* https://github.com/astral-sh/ruff/pull/11443
* https://github.com/astral-sh/ruff/pull/11474

For context,
https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220
added support for soft keywords in the parser which uses infinite
lookahead to classify a soft keyword as a keyword or an identifier. This
is a brilliant idea as it basically wraps the existing Lexer and works
on top of it which means that the logic for lexing and re-lexing a soft
keyword remains separate. The change here is to remove
`SoftKeywordTransformer` and let the parser determine this based on
context, lookahead and speculative parsing.

* **Context:** The transformer needs to know whether the lexer is at a
statement position or a simple-statement position.
This is because a `match` token starts a compound statement while a
`type` token starts a simple statement. **The parser already knows
this.**
* **Lookahead:** Now that the parser knows the context it can perform
lookahead of up to two tokens to classify the soft keyword. The logic
for this is mentioned in the PRs implementing it for the `type` and
`match` soft keywords.
* **Speculative parsing:** This is where the checkpoint - rewind
infrastructure helps. For `match` soft keyword, there are certain cases
for which we can't classify based on lookahead. The idea here is to
create a checkpoint and keep parsing. Based on whether the parsing was
successful and what tokens are ahead we can classify the remaining
cases. Refer to #11443 for more details.

If the soft keyword is being parsed in an identifier context, it'll be
converted to an identifier and the emitted token will be updated as
well. Refer
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491).

The `case` soft keyword doesn't require any special handling because
it'll be a keyword only in the context of a match statement.
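
A parse-level sketch of the classification (`command` is a placeholder; runtime behavior is irrelevant here):

```py
match = 1           # simple-statement position followed by `=`: identifier
print(match + 1)    # expression context: identifier
match command:      # statement position with a subject and a `:` block:
    case "quit":    # `match` is a soft keyword here
        ...
```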

### Update the parser API

* https://github.com/astral-sh/ruff/pull/11494
* https://github.com/astral-sh/ruff/pull/11505

Now that the lexer is in sync with the parser, and the parser helps to
determine whether a soft keyword is a keyword or an identifier, the
lexer cannot be used on its own. The reason being that it's not
sensitive to the context (which is correct). This means that the parser
API needs to be updated to not allow any access to the lexer.

Previously, there were multiple ways to parse the source code:
1. Passing the source code itself
2. Or, passing the tokens

Now that the lexer and parser are working together, the API
corresponding to (2) cannot exist. The final API is mentioned in this
PR description: https://github.com/astral-sh/ruff/pull/11494.

### Refactor the downstream tools (linter and formatter)

* https://github.com/astral-sh/ruff/pull/11511
* https://github.com/astral-sh/ruff/pull/11515
* https://github.com/astral-sh/ruff/pull/11529
* https://github.com/astral-sh/ruff/pull/11562
* https://github.com/astral-sh/ruff/pull/11592

And, the final set of changes involves updating all references of the
lexer and the `Tok` enum. This was done in two parts:
1. Update all the references in a way that doesn't require any changes
from this PR i.e., it can be done independently
	* https://github.com/astral-sh/ruff/pull/11402
	* https://github.com/astral-sh/ruff/pull/11406
	* https://github.com/astral-sh/ruff/pull/11418
	* https://github.com/astral-sh/ruff/pull/11419
	* https://github.com/astral-sh/ruff/pull/11420
	* https://github.com/astral-sh/ruff/pull/11424
2. Update all the remaining references to use the changes made in this
PR

For (2), there were various strategies used:
1. Introduce a new `Tokens` struct which wraps the token vector and add
methods to query a certain subset of tokens. These include:
	1. `up_to_first_unknown` which replaces the `tokenize` function
	2. `in_range` and `after` which replace the `lex_starts_at` function
where the former returns the tokens within the given range while the
latter returns all the tokens after the given offset
2. Introduce a new `TokenFlags` which is a set of flags to query certain
information from a token. Currently, this information is only limited to
any string type token but can be expanded to include other information
in the future as needed. https://github.com/astral-sh/ruff/pull/11578
3. Move the `CommentRanges` to the parsed output because this
information is common to both the linter and the formatter. This removes
the need for the `tokens_and_ranges` function.

## Test Plan

- [x] Update and verify the test snapshots
- [x] Make sure the entire test suite is passing
- [x] Make sure there are no changes in the ecosystem checks
- [x] Run the fuzzer on the parser
- [x] Run this change on dozens of open-source projects

### Running this change on dozens of open-source projects

Refer to the PR description to get the list of open source projects used
for testing.

Now, the following tests were done between `main` and this branch:
1. Compare the output of `--select=E999` (syntax errors)
2. Compare the output of default rule selection
3. Compare the output of `--select=ALL`

**Conclusion: all outputs were the same.**

## What's next?

The next step is to introduce re-lexing logic and update the parser to
feed the recovery information to the lexer so that it can emit the
correct token. This moves us one step closer to having error resilience
in the parser and gives Ruff the ability to lint even if the
source code contains syntax errors.
2024-06-03 18:23:50 +05:30
renovate[bot]
c69a789aa5 Update NPM Development dependencies (#11713) 2024-06-03 01:59:07 +00:00
renovate[bot]
140c408a92 Update pre-commit dependencies (#11712) 2024-06-02 21:51:42 -04:00
renovate[bot]
27085a93d9 Update cloudflare/wrangler-action action to v3.6.1 (#11709) 2024-06-02 21:51:27 -04:00
renovate[bot]
a9b6c4f269 Update dependency monaco-editor to ^0.49.0 (#11710) 2024-06-02 21:51:23 -04:00
renovate[bot]
ded010cf9c Update Rust crate tracing-tree to v0.3.1 (#11703) 2024-06-02 21:51:13 -04:00
renovate[bot]
436dc18b15 Update Rust crate libcst to v1.4.0 (#11707) 2024-06-03 01:05:32 +00:00
renovate[bot]
9599bd7622 Update Rust crate itertools to 0.13.0 (#11706) 2024-06-03 01:05:17 +00:00
renovate[bot]
ec3f523924 Update Rust crate insta to v1.39.0 (#11705) 2024-06-03 01:04:26 +00:00
renovate[bot]
010434015e Update Rust crate proc-macro2 to v1.0.85 (#11700) 2024-06-03 01:03:31 +00:00
renovate[bot]
25131da2c3 Update Rust crate toml to v0.8.13 (#11702) 2024-06-02 21:03:09 -04:00
renovate[bot]
712783825d Update Rust crate strum_macros to v0.26.3 (#11701) 2024-06-02 21:03:03 -04:00
Alex Waygood
94a3c53841 Update UP035 for Python 3.13 and the latest version of typing_extensions (#11693) 2024-06-02 22:59:48 +01:00
Tobias Fischer
0ea2519e80 Add RDJson support. (#11682)
## Summary

Implement support for RDJson output for `ruff check`, as requested in
#8655.

## Test Plan

Tested using a snapshot test. Same approach as for e.g. the JSON output
formatter.

## Additional info

I tried to keep the implementation close to the JSON implementation.

I had to deviate a bit to make the `suggestions` key work: if there are
no suggestions, then setting `suggestions` to `null` is invalid
according to the JSONSchema. Therefore, I opted for a slightly more
complex implementation, that skips the `suggestions` key entirely if
there are no fixes available for the given diagnostic. Maybe it would
have been easier to set `"suggestions": []`, but I ended up doing it
this way.

I didn't consider notebooks, as I _think_ that RDJson doesn't work with
notebooks. This should be confirmed, and if so, there should be some
form of warning or error emitted when trying to output diagnostics for a
notebook.

I also didn't consider `ruff format`, as this comment:
https://github.com/astral-sh/ruff/issues/8655#issuecomment-1811446160
suggests that that wouldn't be compatible.

I'm new to Rust, so any feedback is appreciated. 🙂 I implemented this
in order to have a productive rainy Saturday afternoon; I'm not
knowledgeable about RDJson beyond the sources linked in the issue.
2024-06-02 17:59:57 +00:00
Charlie Marsh
6d79ddc0aa [pyupgrade] Write empty string in lieu of panic (#11696)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11692.
2024-06-02 17:51:03 +00:00
Alex Waygood
9f3e609278 Make tests aware that py313 is the latest supported Python version (#11690) 2024-06-02 13:06:04 +00:00
Charlie Marsh
b36dd1aa51 [flake8-simplify] Simplify double negatives in SIM103 (#11684)
## Summary

Closes: https://github.com/astral-sh/ruff/issues/11685.
2024-06-01 23:21:11 +00:00
Charlie Marsh
fd9d68051e Update CHANGELOG.md (#11683) 2024-06-01 18:08:02 +00:00
github-actions[bot]
99834ee93d Sync vendored typeshed stubs (#11668)
Close and reopen this PR to trigger CI

Co-authored-by: typeshedbot <>
2024-05-31 22:26:20 -06:00
Charlie Marsh
b80bf22c4d Omit red-knot PRs from the changelog (#11666)
## Summary

This just ensures that PRs labelled with `red-knot` are automatically
filtered out from the auto-generated changelog (which we then manually
finalize anyway).
2024-05-31 19:18:53 -04:00
Tobias Fischer
312f6640b8 [flake8-bugbear] Implement return-in-generator (B901) (#11644)
## Summary

This PR implements the rule B901, which is part of the opinionated rules
of `flake8-bugbear`.

This rule seems to be desired in `ruff` as per
https://github.com/astral-sh/ruff/issues/3758 and
https://github.com/astral-sh/ruff/issues/2954#issuecomment-1441162976.
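
For reference, a minimal sketch of what the rule flags:

```py
def generate():
    yield 1
    return 42  # B901: `return` with a value inside a generator
```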

## Test Plan

As this PR was made closely following the
[CONTRIBUTING.md](8a25531a71/CONTRIBUTING.md),
it tests using the snapshot approach, that is described there.

## Sources

The implementation is inspired by [the original implementation in the
`flake8-bugbear`
repository](d1aec4cbef/bugbear.py (L1092)).
The error message and [test
file](d1aec4cbef/tests/b901.py)
were also copied from there.

I came up with the documentation on my own, and it needs improvement. Maybe
the example given in
https://github.com/astral-sh/ruff/issues/2954#issuecomment-1441162976
could be used, but maybe they are too complex, I'm not sure.

## Open Questions

- [ ] Documentation. (See above.)

- [x] Can I access the parent in a visitor?

The [original
implementation](d1aec4cbef/bugbear.py (L1100))
references the `yield` statement's parent to check if it is an
expression statement. I didn't find a way to do this in `ruff` and used
the `is_expresssion_statement` field on the visitor instead. What are
your thoughts on this? Is it possible and / or desired to access the
parent node here?

- [x] Is `Option::is_some(...)` -> `...unwrap()` the right thing to do?

Referring to [this piece of
code](9d5a280f71/crates/ruff_linter/src/rules/flake8_bugbear/rules/return_x_in_generator.rs (L91-L96)).
From my understanding, the `.unwrap()` is safe, because it is checked
that `return_` is not `None`. However, I feel like I missed a more
elegant solution that does both in one.

## Other

I don't know a lot about this rule, I just implemented it because I
found it in a
https://github.com/astral-sh/ruff/labels/good%20first%20issue.

I'm new to Rust, so any constructive criticism is appreciated.

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-05-31 21:48:36 +00:00
Charlie Marsh
91a5fdee7a Use find in indent detection (#11650) 2024-05-31 20:35:19 +00:00
Charlie Marsh
1ad5f9c038 Bump version to v0.4.7 (#11646) 2024-05-31 16:30:36 -04:00
plredmond
e914bc300b F401 sort bindings before adding to __all__ (#11648)
Sort the binding IDs before passing them to the add-to-`__all__`
function to address #11619.
2024-05-31 20:29:08 +00:00
Carl Meyer
27f6f048f0 [red-knot] initial (very incomplete) flow graph (#11624)

## Summary

Introduces the skeleton of the flow graph. So far it doesn't actually
handle any non-linear control flow :) But it does show how we can go
from an expression that references a symbol, backward through the flow
graph, to find reachable definitions of that symbol.

Adding non-linear control flow will mean adding flow nodes with multiple
predecessors, which will introduce more complexity into
`ReachableDefinitionsIterator.next()`. But one step at a time.

## Test Plan

Added a (very basic) test.
2024-05-31 14:27:17 -06:00
Alex Waygood
d62a617938 red-knot: Don't refer to Module instances as IDs (#11649) 2024-05-31 20:04:47 +00:00
Carl Meyer
16a926d138 [red-knot] infer int literal types (#11623)
## Summary

Give red-knot the ability to infer int literal types. This is quick and
easy, mostly because these types are a convenient way to observe
control-flow handling with simple assignments.

## Test Plan

Added test.
2024-05-31 13:52:29 -06:00
Jakub Marcowski
05566c6075 Update Who's Using Ruff? section to include Godot (#11647)
## Summary

- Ever since https://github.com/godotengine/godot/pull/90457 was merged
into the `master` branch, Godot has been using ruff for linting and
formatting Python files. As such, this PR adds Godot to the "Who's Using
Ruff?" section of the main `README.md` file.

## Test Plan

- N/A
2024-05-31 15:33:39 -04:00
JaRoSchm
7ce17b7736 Add Vim and Kate setup guide for ruff server (#11615)
## Summary

In the [roadmap for `ruff
server`](https://github.com/astral-sh/ruff/discussions/10581) support
for Vim and Kate is listed. Therefore, I added setup guides for them
based on the Neovim guide. As I don't use Pyright, I wasn't able to
translate the corresponding part of the Neovim guide.

## Test Plan

Doesn't apply.
2024-05-31 19:06:55 +00:00
Charlie Marsh
f9a64503c8 Use char index rather than position for indent slice (#11645)
## Summary

A beginner's mistake :)

Closes https://github.com/astral-sh/ruff/issues/11641.
2024-05-31 19:04:36 +00:00
Alex Waygood
8a25531a71 red-knot: improve internal documentation in module.rs (#11638) 2024-05-31 16:11:18 +00:00
Micha Reiser
9b6d2ce1f2 Fix incorect placement of trailing stub function comments (#11632) 2024-05-31 12:06:17 +00:00
Carl Meyer
889667ad84 [red-knot] Update CODEOWNERS (#11625)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-05-31 06:47:53 +00:00
T-256
5b500fc4dc ruff server: Add support for documents not exist on disk (#11588)
Co-authored-by: T-256 <Tester@test.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-05-31 08:34:10 +02:00
Charlie Marsh
685d11a909 Mark repeated-isinstance-calls as unsafe on Python 3.10 and later (#11622)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11616.
2024-05-30 18:05:24 +00:00
plredmond
dcabd04caf F401 use BTreeMap instead of FxHashMap (#11621)
* Potentially resolves #11619 (nondeterministic hashmap order across
different architectures) in F401 by replacing a hashmap that has
nondeterministic traversal order with an ordered mapping.

I'm not sure how to test this with our CI/CD. I don't have an s390x
machine at home. Should I try it in Qemu?
2024-05-30 10:54:46 -07:00
Charlie Marsh
3aa7e35a4c Avoid removing newlines between docstring headers and rST blocks (#11609)
Given:

```python
def func():
    """
    Example:

    .. code-block:: python

        import foo
    """
```

Removing the newline after the `Example:` header breaks Sphinx
rendering.

See: https://github.com/astral-sh/ruff/issues/11577
2024-05-30 13:29:20 -04:00
Micha Reiser
b0a751012e Document bump to win 10 (#11613) 2024-05-30 07:49:38 +00:00
Charlie Marsh
bd46cd1fcf Infer indentation with imports when logical indent is absent (#11608)
## Summary

In an `__init__.py` file, it's not uncommon to lack a logical indent
(since it may just contain imports). In such cases, we were always
falling back to four-space indent. This PR adds detection for indents
within import groups.
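
For example (a sketch with placeholder names), in an `__init__.py` like the following, the two-space indent inside the import group is now detected and used instead of the four-space fallback:

```py
from .core import (
  alpha,
  beta,
)
```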

Closes https://github.com/astral-sh/ruff/issues/11606.
2024-05-30 00:18:07 -04:00
Charlie Marsh
a8d1328c1a [flake8-comprehension] Strip parentheses around generators in C400 (#11607)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11603.
2024-05-30 03:26:56 +00:00
Christoph Hasse
e35deee583 fix(F822): add option to enable F822 in __init__.py files (#11370)
## Summary

This PR aims to close #10095 by adding an option
`init-allow-undef-export` to the `pyflakes` settings. This option is
currently set to `true` such that behavior is kept identical.
But setting this option to `false` will cause `F822` warnings to be
shown in all files, **including** `__init__.py` files.

As I've mentioned on #10095, I think `init-allow-undef-export=false`
would be the more user-friendly default option, as it creates fewer
surprises. @charliermarsh what do you think about making that the
default?

With this option in place, it's a single line fix for people that rely
on the old behavior.

And thinking longer term, for future major releases, one could probably
consider deprecating the option and eventually having people just `noqa`
these warnings if they are not wanted.


## Test Plan

I've added a `test_init_f822_enabled` test which repeats the test that
is done in the `init` test but this time with
`init-allow-undef-export=false` and the snap file correctly shows that
ruff will then trigger the otherwise suppressed F822 warning.


closes #10095
2024-05-30 03:15:05 +00:00
Micha Reiser
921bc15542 use owned ast and tokens in bench (#11598) 2024-05-29 18:10:32 +02:00
Vitaliy
e14096f0a8 docs: Minor formatting typo in F401 example. (#11601)
## Summary

Removed stray space in sample code snippet that is against ruff's own
default formatting rules.

This documentation appears on
https://docs.astral.sh/ruff/rules/unused-import/

## Test Plan

This is a trivially obvious change, verifiable with `ruff format
--check`
2024-05-29 11:14:53 -04:00
T-256
5f976cae07 Windows: Statically linked C runtime (#11589)
Co-authored-by: T-256 <Tester@test.com>
2024-05-29 14:00:12 +02:00
Tomas R
7659114eb3 [flake8-pyi] Implement PYI057 (#11486)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-05-29 10:04:36 +00:00
Micha Reiser
163c374242 Reduce extensive use of snapshot.query (#11596) 2024-05-29 10:11:46 +02:00
Charlie Marsh
204c59e353 Respect file exclusions in ruff server (#11590)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11587.

## Test Plan

- Added a lint error to `test_server.py` in `vscode-ruff`.
- Validated that, prior to this change, diagnostics appeared in the
file.
- Validated that, with this change, no diagnostics were shown.
- Validated that, with this change, no diagnostics were fixed on-save.
2024-05-29 02:58:36 +00:00
Tushar Sadhwani
531ae5227c [flake8-pyi] Implement PYI066 (#11541)
## Summary

- Implements `Y066` from `flake8-pyi` as `PYI066`
- Fixes `PYI006` not being raised for `elif` clauses. This would have
conflicted with PYI006's implementation, so I decided to do it in the same
PR.

## Test Plan

`cargo test` / `cargo insta review`
2024-05-29 00:30:00 +00:00
Tushar Sadhwani
e0169d8dea [flake8-pyi] Implement PYI064 (#11325)
## Summary

Implements `Y064` from `flake8-pyi` and its autofix.

## Test Plan

`cargo test` / `cargo insta review`
2024-05-28 23:57:13 +00:00
plredmond
9a3b9f9fb5 [redknot] add module type and attribute lookup for some types (#11416)
* Add a module type, `ModuleTypeId`
* Add an attribute lookup method `get_member` for `Type`
  * Only implemented for `ModuleTypeId` and `ClassTypeId`
  * [x] Should this be a trait?
    *Answer: no*
* [x] Uses `unwrap`, but we should remove that. Maybe add a new variant
to `QueryError`?
    *Answer: Return `Option<Type>` as is done elsewhere*
* Add `infer_definition_type` case for `Import`
* Add `infer_expr_type` case for `Attribute`
* Add a test to exercise these
* [x] remove all NOTE/FIXME/TODO after discussing with reviewers
2024-05-28 13:13:03 -07:00
Charlie Marsh
49a5a9ccc2 Bump version to v0.4.6 (#11585) 2024-05-28 15:10:53 -04:00
Charlie Marsh
69d9212817 Propagate reads on global variables (#11584)
## Summary

This PR ensures that if a variable is bound via `global`, and then the
`global` is read, the originating variable is also marked as read. It's
not perfect, in that it won't detect _rebindings_, like:

```python
from app import redis_connection

def func():
    global redis_connection

    redis_connection = 1
    redis_connection()
```

So, above, `redis_connection` is still marked as unused.

But it does avoid flagging `redis_connection` as unused in:

```python
from app import redis_connection

def func():
    global redis_connection

    redis_connection()
```

Closes https://github.com/astral-sh/ruff/issues/11518.
2024-05-28 14:47:05 -04:00
Akshet Pandey
4a305588e9 [flake8-bandit] request-without-timeout should warn for requests.request (#11548)
## Summary
Update
[S113](https://docs.astral.sh/ruff/rules/request-without-timeout/) to
also warn when `timeout` is missing in calls to `requests.request`.
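
For example (a sketch):

```py
import requests

requests.request("GET", "https://example.com")              # S113: no timeout
requests.request("GET", "https://example.com", timeout=10)  # OK
```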
2024-05-28 16:31:12 +00:00
Charlie Marsh
16acd4913f Remove some unused pub functions (#11576)
## Summary

I left alone anything in `red-knot`, any `with_` methods, etc.
2024-05-28 09:56:51 -04:00
Micha Reiser
3989cb8b56 Make ruff_notebook a workspace dependency in ruff_server (#11572) 2024-05-28 09:26:39 +02:00
Charlie Marsh
a38c05bf13 Avoid recommending context manager in __enter__ implementations (#11575)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11567.
2024-05-28 01:44:24 +00:00
Charlie Marsh
ab107ef1f3 Avoid recomending operator.itemgetter with dependence on lambda arg (#11574)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11573.
2024-05-28 01:29:29 +00:00
Ahmed Ilyas
b36c713279 Consider irrefutable pattern similar to if .. else for C901 (#11565)
## Summary

Follow up to https://github.com/astral-sh/ruff/pull/11521

Removes the extra added complexity for catch all match cases. This
matches the implementation of plain `else` statements.
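
For example (a sketch with placeholder functions), the `case _` below now counts like a plain `else`:

```py
match command:
    case "start":
        start()
    case _:  # irrefutable catch-all, treated like `else`
        stop()
```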

## Test Plan
Added new test cases.

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-05-27 17:33:36 +00:00
Charlie Marsh
34a5063aa2 Respect excludes in ruff server configuration discovery (#11551)
## Summary

Right now, we're discovering configuration files even within (e.g.)
virtual environments, because we're recursing without respecting the
`exclude` field on parent configuration.

Closes https://github.com/astral-sh/ruff-vscode/issues/478.

## Test Plan

Installed Pandas; verified that I saw no warnings:

![Screenshot 2024-05-26 at 8 09
05 PM](https://github.com/astral-sh/ruff/assets/1309177/dcf4115c-d7b3-453b-b7c7-afdd4804d6f5)
2024-05-27 16:59:46 +00:00
Micha Reiser
adc0a5d126 Rename document module to text_document (#11571) 2024-05-27 18:32:21 +02:00
Dhruv Manilawala
e28e737296 Update FStringElements to deref to a slice (#11570)
Ref: https://github.com/astral-sh/ruff/pull/11400#discussion_r1615600354
2024-05-27 15:52:13 +00:00
Dhruv Manilawala
37ad994318 Use default settings if initialization options is empty or not provided (#11566)
## Summary

This PR fixes the bug to avoid flattening the global-only settings for
the new server.

This was added in https://github.com/astral-sh/ruff/pull/11497, possibly
to correctly de-serialize an empty value (`{}`). But this led to a bug
where the configuration under the `settings` key was not being read for
the global-only variant.

By using `#[serde(default)]`, we ensure that the `settings` field in the
`GlobalOnly` variant is optional and that an empty JSON object `{}` is
correctly deserialized into `GlobalOnly` with a default `ClientSettings`
instance.

fixes: #11507 

## Test Plan

Update the snapshot and existing test case. Also, verify the following
settings in Neovim:

1. Nothing

```lua
ruff = {
  cmd = {
    '/Users/dhruv/work/astral/ruff/target/debug/ruff',
    'server',
    '--preview',
  },
}
```

2. Empty dictionary

```lua
ruff = {
  cmd = {
    '/Users/dhruv/work/astral/ruff/target/debug/ruff',
    'server',
    '--preview',
  },
  init_options = vim.empty_dict(),
}
```

3. Empty `settings`

```lua
ruff = {
  cmd = {
    '/Users/dhruv/work/astral/ruff/target/debug/ruff',
    'server',
    '--preview',
  },
  init_options = {
    settings = vim.empty_dict(),
  },
}
```

4. With some configuration:

```lua
ruff = {
  cmd = {
    '/Users/dhruv/work/astral/ruff/target/debug/ruff',
    'server',
    '--preview',
  },
  init_options = {
    settings = {
      configuration = '/tmp/ruff-repro/pyproject.toml',
    },
  },
}
```
2024-05-27 21:06:34 +05:30
Alex Waygood
246a3388ee Implement a common trait for the string flags (#11564) 2024-05-27 16:02:01 +01:00
Evan Kohilas
6be00d5775 Adds recommended extension settings for vscode (#11519) 2024-05-27 13:04:32 +02:00
Dhruv Manilawala
9200dfc79f Remove empty strings when converting to f-string (UP032) (#11524)
## Summary

This PR brings back the functionality to remove empty strings when
converting to an f-string in `UP032`.

For context, https://github.com/astral-sh/ruff/pull/8712 added this
functionality to remove _trailing_ empty strings, but it was removed in
https://github.com/astral-sh/ruff/pull/8697, possibly unintentionally.

There's one difference which is that this PR will remove _any_ empty
strings and not just trailing ones. For example,

```diff
--- /Users/dhruv/playground/ruff/src/UP032.py
+++ /Users/dhruv/playground/ruff/src/UP032.py
@@ -1,7 +1,5 @@
 (
-    "{a}"
-    ""
-    "{b}"
-    ""
-).format(a=1, b=1)
+    f"{1}"
+    f"{1}"
+)
```

## Test Plan

Run `cargo insta test` and update the snapshots.
2024-05-27 05:05:22 +00:00
renovate[bot]
5dcde88099 Update Rust crate thiserror to v1.0.61 (#11561) 2024-05-27 00:33:54 +00:00
renovate[bot]
7794eb2bde Update Rust crate proc-macro2 to v1.0.84 (#11556) 2024-05-26 20:21:50 -04:00
renovate[bot]
40bfae4f99 Update Rust crate syn to v2.0.66 (#11560) 2024-05-27 00:21:44 +00:00
renovate[bot]
7b064b25b2 Update Rust crate mimalloc to v0.1.42 (#11554) 2024-05-26 20:21:39 -04:00
renovate[bot]
9993115f63 Update Rust crate smol_str to v0.2.2 (#11559) 2024-05-26 20:21:25 -04:00
renovate[bot]
f0a21c9161 Update Rust crate serde to v1.0.203 (#11558) 2024-05-26 20:21:19 -04:00
renovate[bot]
f26c155de5 Update Rust crate schemars to v0.8.21 (#11557) 2024-05-26 20:21:13 -04:00
renovate[bot]
c3fa826b0a Update Rust crate parking_lot to v0.12.3 (#11555) 2024-05-26 20:21:03 -04:00
renovate[bot]
8b69794f1d Update Rust crate libc to v0.2.155 (#11553) 2024-05-26 20:20:47 -04:00
renovate[bot]
4e7c84df1d Update Rust crate anyhow to v1.0.86 (#11552) 2024-05-26 20:20:38 -04:00
Dhruv Manilawala
99c400000a Avoid owned token data in sequence sorting (#11533)
## Summary

This PR updates the sequence sorting (`RUF022` and `RUF023`) to avoid
using the owned data from the string token. Instead, we will directly
use the reference to the data on the AST. This does introduce a lot of
lifetimes but that's required.

The main motivation for this is to allow removing the `lex_starts_at`
usage easily.

### Alternatives

1. Extract the raw string content (stripping the prefix and quotes)
using the `Locator` and use that for comparison
2. Build up an
[`IndexVec`](3e30962077/crates/ruff_index/src/vec.rs)
and use the newtype index in place of the string value itself. This also
does require lifetimes so we might as well just use the method in this
PR.

## Test Plan

`cargo insta test` and no ecosystem changes
2024-05-26 20:20:20 -04:00
Charlie Marsh
b5d147d219 Create intermediary directories for --output-file (#11550)
Closes https://github.com/astral-sh/ruff/issues/11549.
2024-05-26 23:23:11 +00:00
Aleksei Latyshev
77da4615c1 [pyupgrade] Support TypeAliasType in UP040 (#11530)
## Summary
Lint `TypeAliasType` in UP040.
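
A sketch of the kind of code now flagged, with the `type` statement as the suggested replacement:

```py
from typing_extensions import TypeAliasType

PositiveInt = TypeAliasType("PositiveInt", int)  # UP040
# suggested rewrite:
# type PositiveInt = int
```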

Fixes #11422 

## Test Plan

cargo test
2024-05-26 19:05:35 +00:00
Jane Lewis
627d230688 ruff server searches for configuration in parent directories (#11537)
## Summary

Fixes #11506.

`RuffSettingsIndex::new` now searches for configuration files in parent
directories.

## Test Plan

I confirmed that the original test case described in the issue worked as
expected.
2024-05-26 18:11:08 +00:00
Fergus Longley
0eef834e89 Use project-relative path when calculating gitlab message fingerprint (#11532)
## Summary

Concurrent GitLab runners clone projects into separate directories, e.g.
`{builds_dir}/$RUNNER_TOKEN_KEY/$CONCURRENT_ID/$NAMESPACE/$PROJECT_NAME`.
Since the fingerprint uses the full path to the file, the fingerprints
calculated by Ruff are different depending on which concurrent runner it
executes on, so often an MR will appear to remove all existing issues
and add them with new fingerprints.

I've adjusted the fingerprint function to use the project relative path,
which fixes this. Unfortunately, this is a breaking change for any
current users of this output: the fingerprints will change, appearing in
GitLab as though all existing linting messages had been fixed and new
ones created.

## Test Plan

`cargo nextest run`

Running `ruff check --output-format gitlab` in a git repo, moving the
repo and running again, verifying no diffs between the outputs
2024-05-26 14:10:04 -04:00
Charlie Marsh
650c578e07 [flake8-self] Ignore sunder accesses in flake8-self rule (#11546)
## Summary

We already ignore dunders, so ignoring sunders (as in
https://docs.python.org/3/library/enum.html#supported-sunder-names)
makes sense to me.
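
For example (a sketch; `member` is a placeholder):

```py
member._name_     # sunder (as used by `enum`): now ignored, like dunders
member.__class__  # dunder: already ignored
member._private   # still flagged as private-member access
```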
2024-05-26 13:57:24 -04:00
Jane Lewis
9567fddf69 ruff server correctly treats .pyi files as stub files (#11535)
## Summary

Fixes #11534.

`DocumentQuery::source_type` now returns `PySourceType::Stub` when the
document is a `.pyi` file.

## Test Plan

I confirmed that stub-specific rule violations appeared with a build
from this PR (they were not visible from a `main` build).

<img width="1066" alt="Screenshot 2024-05-24 at 2 15 38 PM"
src="https://github.com/astral-sh/ruff/assets/19577865/cd519b7e-21e4-41c8-bc30-43eb6d4d438e">
2024-05-26 13:42:48 -04:00
Mateusz Sokół
ab6d9d4658 Add missing functions to NumPy 2.0 migration rule (#11528)
Hi! 

I left out some of the functions in the migration rule that were
removed in NumPy 2.0:
- `np.alltrue`
- `np.anytrue`
- `np.cumproduct`
- `np.product`

Addressing: https://github.com/numpy/numpy/issues/26493
2024-05-26 13:24:20 -04:00
Amar Paul
677893226a [flake8-2020] fix minor typo in YTT301 documentation (#11543)
## Summary

The current doc says `sys.version[0]` will select the first digit of the
major version number (correct), then as an example says

> e.g., `"3.10"` would evaluate to `"1"`

(would actually evaluate to `"3"`). Changed the example version to a
two-digit number to make the problem more clear.

## Test Plan

ran the following:
- `cargo run -p ruff -- check
crates/ruff_linter/resources/test/fixtures/flake8_2020/YTT301.py
--no-cache`
- `cargo insta review`
- `cargo test`
which all passed.
2024-05-26 13:23:41 -04:00
Ahmed Ilyas
33fd50027c Consider match-case stmts for C901, PLR0912, and PLR0915 (#11521)
Resolves #11421

## Summary

Instead of counting match/case as one statement, consider each `case` as
a conditional.

## Test Plan

`cargo test`
2024-05-24 14:44:46 +05:30
Dmitry Bogorad
3e30962077 [flake8-logging-format] Fix the autofix title in logging-warn (G010) (#11514)
## Summary

Rule `logging-warn` (`G010`) prescribes a change from `warn` to
`warning` and has a corresponding autofix, but the autofix is mistakenly
titled ```"Convert to `warn`"``` instead of ```"Convert to `warning`"```
(the latter is what the autofix actually does). Seems to be a plain
typo.
2024-05-24 13:13:42 +05:30
Jane Lewis
81275a6c3d ruff server: An empty code action filter no longer returns notebook source actions (#11526)
## Summary

Fixes #11516

`ruff server` was sending both regular source actions and notebook
source actions back when passed an empty action filter. This PR makes a
few small changes so that notebook source actions are not sent when
regular source actions are sent, which means that an empty filter will
only return regular source actions.

## Test Plan

I confirmed that duplicate code actions no longer appeared in Neovim,
using a configuration similar to the one from the original issue.

<img width="509" alt="Screenshot 2024-05-23 at 11 48 48 PM"
src="https://github.com/astral-sh/ruff/assets/19577865/9a5d6907-dd41-48bd-b015-8a344c5e0b3f">
2024-05-24 07:20:39 +00:00
Charlie Marsh
52c946a4c5 Treat all singledispatch arguments as runtime-required (#11523)
## Summary

It turns out that `singledispatch` does end up evaluating all arguments,
even though only the first is used to dispatch.
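
A sketch of why this matters for type-checking-import rules (`DataFrame` here is just an illustrative annotation):

```py
from functools import singledispatch
from pandas import DataFrame  # must remain a runtime import: per this PR,
                              # singledispatch evaluates every annotation

@singledispatch
def process(df: DataFrame, meta: DataFrame | None = None) -> None:
    ...
```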

Closes https://github.com/astral-sh/ruff/issues/11520.
2024-05-23 20:36:24 -04:00
Evan Kohilas
ebdaf5765a [flake8-async] Sleep with >24 hour interval should usually sleep forever (ASYNC116) (#11498)
## Summary

Addresses #8451 by implementing rule 116, which adds an unsafe fix that
suggests sleeping forever when `sleep` is called with an interval longer
than 24 hours.
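
A minimal sketch of the pattern and the suggested fix:

```py
import trio

async def main():
    await trio.sleep(86401)  # ASYNC116: interval longer than 24 hours
    # the suggested (unsafe) fix replaces it with:
    await trio.sleep_forever()
```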

This rule is added under `async` instead, as my understanding is that
these trio rules will be moved to `async` anyway.

There are a couple of TODOs, which address further extending the rule by
adding support for lookups and evaluations, and also supporting `anyio`.
2024-05-23 16:25:50 -04:00
Christian Adell
9a93409e1c Update README.md - new Ruff user (#11509) 2024-05-23 15:50:17 -04:00
Dhruv Manilawala
102b9d930f Use Importer available on Checker (#11513)
## Summary

This PR updates the `FA102` rule logic to use the `Importer` which is
available on the `Checker`.

The main motivation is that this would make updating the `Importer` to
use the `Tokens` struct which will be required to remove the
`lex_starts_at` usage in `Insertion::start_of_block` method.

## Test Plan

`cargo insta test`
2024-05-23 11:19:08 +00:00
Jane Lewis
550aa871d3 Bump version to v0.4.5 (#11502) 2024-05-23 01:09:01 +00:00
Charlie Marsh
3c22a3bdcc Minor edits to ruff server docs (#11500)
## Summary

Minor copy edits based on my read-through. Feel free to disagree
anywhere.
2024-05-22 23:53:53 +00:00
Jane Lewis
6263923915 Update documentation for ruff server with new migration guide (#11499)
## Summary

Introduces a migration guide from `ruff-lsp` to `ruff server` and makes
small updates to the `README.md`.
2024-05-22 14:36:33 -07:00
Jane Lewis
94abea4b08 ruff server: Fix multiple issues with Neovim and Helix (#11497)
## Summary

Fixes https://github.com/astral-sh/ruff/issues/11236.

This PR fixes several issues, most of which relate to non-VS Code
editors (Helix and Neovim).

1. Global-only initialization options are now correctly deserialized
from Neovim and Helix
2. Empty diagnostics are now published correctly for Neovim and Helix.
3. A workspace folder is created at the current working directory if the
initialization parameters send an empty list of workspace folders.
4. The server now gracefully handles opening files outside of any known
workspace, and will use global fallback settings taken from client
editor settings and a user settings TOML, if it exists.

## Test Plan

I've tested to confirm that each issue has been fixed.

* Global-only initialization options are now correctly deserialized from
Neovim and Helix + the server gracefully handles opening files outside
of any known workspace


https://github.com/astral-sh/ruff/assets/19577865/4f33477f-20c8-4e50-8214-6608b1a1ea6b

* Empty diagnostics are now published correctly for Neovim and Helix


https://github.com/astral-sh/ruff/assets/19577865/c93f56a0-f75d-466f-9f40-d77f99cf0637

* A workspace folder is created at the current working directory if the
initialization parameters send an empty list of workspace folders.



https://github.com/astral-sh/ruff/assets/19577865/b4b2e818-4b0d-40ce-961d-5831478cc726
2024-05-22 20:50:58 +00:00
Charlie Marsh
519a65007f Mark quotes as unnecessary for non-evaluated annotations (#11485)
## Summary

Similar to #11414, this PR extends `UP037` to flag quoted annotations
that are located in positions that won't be evaluated at runtime.

For example, the quotes on `Tuple` are unnecessary in:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from typing import Tuple


def foo():
    x: "Tuple[int, int]" = (0, 0)

foo()
```
2024-05-22 15:44:31 -04:00
Jane Lewis
573facd2ba Fix automatic configuration reloading for text and notebook documents (#11492)
## Summary

Recent changes made in the [Jupyter Notebook feature
PR](https://github.com/astral-sh/ruff/pull/11206) caused automatic
configuration reloading to stop working. This was because we would check
for paths to reload using the changed path, when we should have been
using the parent path of the changed path (to get the directory it was
changed in).

Additionally, this PR fixes an issue where `ruff.toml` and `.ruff.toml`
files were not being automatically reloaded.

Finally, this PR improves configuration reloading by actively publishing
diagnostics for notebook documents (which won't be affected by the
workspace refresh since they don't use pull diagnostics). It will also
publish diagnostics for text documents if pull diagnostics aren't
supported.

## Test Plan
To test this, open an existing configuration file in a codebase, and
make modifications that will affect one or more open Python / Jupyter
Notebook files. You should observe that the diagnostics for both kinds
of files update automatically when the file changes are saved.

Here's a test video showing what a successful test should look like:



https://github.com/astral-sh/ruff/assets/19577865/7172b598-d6de-4965-b33c-6cb8b911ef6c
2024-05-22 11:20:45 -07:00
Jane Lewis
3cb2e677aa ruff.applyFormat now formats an entire notebook document (#11493)
## Summary

Previously, `ruff.applyFormat`, seen in VS Code as the command `Ruff:
Format Document`, would only format the currently active notebook cell
inside a notebook document. This PR makes `ruff.applyFormat` format the
entire notebook document at once, operating on each code cell in order.

## Test Plan

1. Open a notebook document that has multiple unformatted code cells.
2. Run `Ruff: Format Document` through the Command Palette
(`Ctrl/Cmd+Shift+P` by default)
3. Observe that all code cells in the notebook have been formatted.
2024-05-22 09:02:46 -07:00
Dhruv Manilawala
f0046ab28e Move has_comments to CommentRanges (#11495)
## Summary

This PR moves the `has_comments` function from `Indexer` to
`CommentRanges`. The main motivation is that the `CommentRanges` will
now be built by the parser which is shared between the linter and the
formatter. Thus, the `CommentRanges` will be removed from the `Indexer`.
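
Conceptually (a hedged Python sketch of the idea, not the Rust implementation), `has_comments` is a range query over the sorted comment ranges:

```python
import bisect

class CommentRanges:
    """Toy model: comment ranges stored as sorted (start, end) offsets."""

    def __init__(self, ranges: list[tuple[int, int]]) -> None:
        self.ranges = sorted(ranges)

    def has_comments(self, start: int, end: int) -> bool:
        # Simplification: check whether any comment *starts* within the span.
        i = bisect.bisect_left(self.ranges, (start, start))
        return i < len(self.ranges) and self.ranges[i][0] < end

ranges = CommentRanges([(9, 20)])
assert ranges.has_comments(0, 25)
assert not ranges.has_comments(0, 5)
```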

## Test Plan

`cargo test`
2024-05-22 13:35:16 +00:00
Charlie Marsh
5bb9720a10 Avoid multiline quotes warning with quote-style = preserve (#11490)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11063.
2024-05-22 04:31:03 +00:00
Dhruv Manilawala
9ff18bf9d3 Simplify Neovim docs for the LSP setup (#11489)
Similar to what we have at
https://github.com/astral-sh/ruff-lsp#example-neovim
2024-05-22 09:51:02 +05:30
Charlie Marsh
aa906b9c75 [pylint] Ignore __slots__ with dynamic values (#11488)
## Summary

Closes https://github.com/astral-sh/ruff/issues/11333.
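
A hedged illustration of the kind of dynamically-computed `__slots__` value the rule now skips rather than trying to analyze:

```python
class Point:
    # Computed at class-creation time; its members can't be known statically,
    # so the lint now ignores it instead of guessing.
    __slots__ = tuple(f"_{name}" for name in ("x", "y"))
```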
2024-05-22 04:18:01 +00:00
Evan Kohilas
3476e2f359 fixes invalid rule from hyphen (#11484)
## Summary

When using `add_rule.py`, it produces the following line in `codes.rs`
```
        (Flake8Async, "102") => (RuleGroup::Stable, rules::flake8-async::rules::BlockingOsCallInAsyncFunction),
```

This causes a syntax error.

This PR resolves that issue so that the script can be used again.
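
The underlying issue is mechanical: Rust module paths use underscores while plugin names use hyphens, so the generator needs to normalize the name (a hedged sketch; the actual script logic may differ):

```python
plugin = "flake8-async"
rust_module = plugin.replace("-", "_")  # "flake8_async", a valid Rust path segment
print(f"rules::{rust_module}::rules::BlockingOsCallInAsyncFunction")
```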

## Test Plan

Tested manually by creating a new rule.
2024-05-21 23:39:50 -04:00
Charlie Marsh
8848eca3c6 [pylint] Remove try body from branch counting (#11487)
## Summary

Matching Pylint, we now omit the `try` body itself from branch counting.
Each `except` counts as a branch, as does the `else` and the `finally`.
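
A hedged illustration of the new counting: the `try` body contributes nothing, while each `except`, the `else`, and the `finally` contribute one branch apiece (four here):

```python
def load(path):
    try:
        data = open(path).read()  # the `try` body itself no longer counts
    except FileNotFoundError:     # branch 1
        data = ""
    except OSError:               # branch 2
        raise
    else:                         # branch 3
        print("ok")
    finally:                      # branch 4
        print("done")
    return data
```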

Closes https://github.com/astral-sh/ruff/issues/11205.
2024-05-21 23:38:51 -04:00
Jane Lewis
b0731ef9cb ruff server: Support Jupyter Notebook (*.ipynb) files (#11206)
## Summary

Closes https://github.com/astral-sh/ruff/issues/10858.

`ruff server` now supports `*.ipynb` (aka Jupyter Notebook) files.
Extensive internal changes have been made to facilitate this, which I've
done some work to contextualize with documentation and a pre-review
that highlights notable sections of the code.

`*.ipynb` cells should behave similarly to `*.py` documents, with one
major exception. The format command `ruff.applyFormat` will only apply
to the currently selected notebook cell - if you want to format an
entire notebook document, use `Format Notebook` from the VS Code context
menu.

## Test Plan

The VS Code extension does not yet have Jupyter Notebook support
enabled, so you'll first need to enable it manually. To do this,
checkout the `pre-release` branch and modify `src/common/server.ts` as
follows:

Before:
![Screenshot 2024-05-13 at 10 59
06 PM](https://github.com/astral-sh/ruff/assets/19577865/c6a3c604-c405-4968-b8a2-5d670de89172)

After:
![Screenshot 2024-05-13 at 10 58
24 PM](https://github.com/astral-sh/ruff/assets/19577865/94ab2e3d-0609-448d-9c8c-cd07c69a513b)

I recommend testing this PR with large, complicated notebook files. I
used notebook files from [this popular
repository](https://github.com/jakevdp/PythonDataScienceHandbook/tree/master/notebooks)
in my preliminary testing.

The main thing to test is ensuring that notebook cells behave the same
as Python documents, besides the aforementioned issue with
`ruff.applyFormat`. You should also test adding and deleting cells (in
particular, deleting all the code cells and ensuring that doesn't break
anything), changing the kind of a cell (i.e. from markup -> code or vice
versa), and creating a new notebook file from scratch. Finally, you
should also test that source actions work as expected (and across the
entire notebook).

Note: `ruff.applyAutofix` and `ruff.applyOrganizeImports` are currently
broken for notebook files, and I suspect it has something to do with
https://github.com/astral-sh/ruff/issues/11248. Once this is fixed, I
will update the test plan accordingly.

---------

Co-authored-by: nolan <nolan.king90@gmail.com>
2024-05-21 22:29:30 +00:00
Nicolas Jeker
84531d1644 Clarify motivation for E713 and E714 (#11483)
The wording 'negative comparison' is a rather vague description of the
'is not' operation and does not describe what the 'not in' operation
does (potentially copied from 'is not'). This was replaced with more
precise language to describe the operators taken from the official
Python docs[1].

Neither rule had strong reasoning besides 'it's bad, use the
other'. The origin of these rules seems to be PEP 8[2], which prefers 'is
not' over 'not ... is' for readability. This is now reflected in the
description.
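
For reference, the two patterns and their preferred rewrites:

```python
key, value, mapping = "k", None, {}

if not key in mapping:      # E713: rewrite as `key not in mapping`
    ...
if not value is None:       # E714: rewrite as `value is not None`
    ...
```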

[1]:
https://docs.python.org/3/reference/expressions.html#membership-test-operations
[2]: https://peps.python.org/pep-0008/#programming-recommendations
2024-05-21 14:12:18 -05:00
Charlie Marsh
83b8b62e3e Avoid flagging __future__ annotations as required for non-evaluated type annotations (#11414)
## Summary

If an annotation won't be evaluated at runtime, we don't need to flag
`from __future__ import annotations` as required. This applies both to
quoted annotations and annotations outside of runtime-evaluated
positions, like:

```python
def main() -> None:
    a_list: list[str] | None = []
    a_list.append("hello")
```
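
And the quoted case mentioned above, for comparison (a hedged sketch): the annotation is never evaluated, so the future import isn't required here either:

```python
def cache() -> None:
    values: "dict[str, int]" = {}  # quoted, so never evaluated at runtime
    values["a"] = 1
```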

Closes https://github.com/astral-sh/ruff/issues/11397.
2024-05-21 18:57:13 +00:00
plredmond
7225732859 F401 - update documentation and deprecate ignore_init_module_imports (#11436)
## Summary

* Update documentation for F401 following recent PRs
  * #11168
  * #11314
* Deprecate `ignore_init_module_imports`
* Add a deprecation pragma to the option and a "warn user once" message
when the option is used.
* Restore the old behavior for stable (non-preview) mode:
* When `ignore_init_module_imports` is set to `true` (default) there are
no `__init__.py` fixes (but we get nice fix titles!).
* When `ignore_init_module_imports` is set to `false` there are unsafe
`__init__.py` fixes to remove unused imports.
* When preview mode is enabled, it overrides
`ignore_init_module_imports`.
* Fixed a bug in fix titles where `import foo as bar` would recommend
reexporting `bar as bar`. It now says to reexport `foo as foo`. (In this
case we don't issue a fix, fwiw; it was just a fix title bug.)

## Test plan

Added new fixture tests that reuse the existing fixtures for
`__init__.py` files. Each of the three situations listed above has
fixture tests. The F401 "stable" tests cover:

> * When `ignore_init_module_imports` is set to `true` (default) there
are no `__init__.py` fixes (but we get nice fix titles!).

The F401 "deprecated option" tests cover:

> * When `ignore_init_module_imports` is set to `false` there are unsafe
`__init__.py` fixes to remove unused imports.

These complement existing "preview" tests that show the new behavior
which recommends fixes in `__init__.py` according to whether the import
is 1st party and other circumstances (for more on that behavior see:
#11314).
2024-05-21 09:23:45 -07:00
Dhruv Manilawala
403f0dccd8 Consider soft keywords for E27 rules (#11446)
## Summary

This is a follow-up PR to #11445 that updates the `E27` rules to consider soft
keywords as well.
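
A hedged sketch of the kind of case the `E27` rules now catch (extra whitespace around the soft keywords `match` and `case`; requires Python 3.10+):

```python
command = "quit"
match  command:        # E271: multiple spaces after keyword
    case  "quit":      # likewise for `case`
        print("bye")
```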

## Test Plan

Add test cases consisting of soft keywords and update the snapshot.
2024-05-20 05:38:06 +00:00
Zanie Blue
46fcd19ca6 Fix division by zero error in ecosystem check (#11469)
e.g.
https://github.com/astral-sh/ruff/actions/runs/9144809516/job/25143076896?pr=11468

<img width="1388" alt="Screenshot 2024-05-19 at 12 02 15 AM"
src="https://github.com/astral-sh/ruff/assets/2586601/0df7cbcd-712c-4ea9-96f5-73f871570525">
2024-05-19 09:08:10 -05:00
Charlie Marsh
d9ec3d56b0 Add some new projects to the ecosystem CI (#11468)
Co-authored-by: Zanie Blue <contact@zanie.dev>
2024-05-19 08:08:38 -05:00
Auguste Lalande
cd87b787d9 Fix windows-ci failure (#11470)

## Summary

The recent issues with the Windows CI seem to be caused by
https://github.com/nextest-rs/nextest/issues/1493, with
https://github.com/nextest-rs/nextest/issues/1493#issuecomment-2106331574
suggesting a fix.

(Let's see if it works)
2024-05-19 07:25:06 -05:00
Charlie Marsh
dd6d411026 Remove comma from ecosystem checks (#11466)
## Summary

Something's up with this repo -- they added a post-checkout hook? So
let's just remove it for now. We should go through and add a new batch
of repositories some time.
2024-05-18 23:37:56 -04:00
Charlie Marsh
cfceb437a8 Treat escaped newline as valid sequence (#11465)
## Summary

We weren't treating the escaped newline as a valid condition to trigger
the safer fix (add an extra backslash before each invalid escape
sequence).
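
A hedged example: the backslash-newline below is a valid escape (a line continuation inside the string), so it should be left alone, while the `\d` is the invalid sequence that gets the extra backslash under the safer fix:

```python
pattern = "foo\
bar \d"  # `\<newline>` is valid; `\d` is the invalid sequence (W605)
```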

Closes https://github.com/astral-sh/ruff/issues/11461.
2024-05-19 03:32:32 +00:00
Charlie Marsh
48b0660228 Respect operator precedence in FURB110 (#11464)
## Summary

Ensures that we parenthesize expressions (if necessary) to preserve
operator precedence in `FURB110`.
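
As a hedged illustration: `FURB110` rewrites `x if x else y` as `x or y`, and when the `else` branch is itself a conditional, the replacement needs parentheses to keep the same parse:

```python
x, y, w = 5, 0, 3

z = x if x else y if y else w   # evaluates to 5
z = x or (y if y else w)        # also 5: parentheses preserve precedence
# Without them, `x or y if y else w` parses as `(x or y) if y else w` -> 3
```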

Closes https://github.com/astral-sh/ruff/issues/11398.
2024-05-19 03:17:11 +00:00
Charlie Marsh
24899efe50 Remove example from tab-indentation (#11462)
## Summary

I think the example is more confusing than helpful, since there's no
visual difference between the tab and space here (even if it rendered
properly).

Closes
https://github.com/astral-sh/ruff/issues/11460#issuecomment-2118397278.
2024-05-17 17:49:16 -04:00
1356 changed files with 44676 additions and 16969 deletions

View File

@@ -1,3 +1,10 @@
[alias]
dev = "run --package ruff_dev --bin ruff_dev"
benchmark = "bench -p ruff_benchmark --bench linter --bench formatter --"
# statically link the C runtime so the executable does not depend on
# that shared/dynamic library.
#
# See: https://github.com/astral-sh/ruff/issues/11503
[target.'cfg(all(target_env="msvc", target_os = "windows"))']
rustflags = ["-C", "target-feature=+crt-static"]

4
.gitattributes vendored
View File

@@ -8,6 +8,10 @@ crates/ruff_linter/resources/test/fixtures/pycodestyle/W391_3.py text eol=crlf
crates/ruff_python_formatter/resources/test/fixtures/ruff/docstring_code_examples_crlf.py text eol=crlf
crates/ruff_python_formatter/tests/snapshots/format@docstring_code_examples_crlf.py.snap text eol=crlf
crates/ruff_python_parser/resources/invalid/re_lexing/line_continuation_windows_eol.py text eol=crlf
crates/ruff_python_parser/resources/invalid/re_lex_logical_token_windows_eol.py text eol=crlf
crates/ruff_python_parser/resources/invalid/re_lex_logical_token_mac_eol.py text eol=cr
crates/ruff_python_parser/resources/inline linguist-generated=true
ruff.schema.json linguist-generated=true text=auto eol=lf

3
.github/CODEOWNERS vendored
View File

@@ -15,3 +15,6 @@
# Script for fuzzing the parser
/scripts/fuzz-parser/ @AlexWaygood
# red-knot
/crates/red_knot/ @carljm @MichaReiser

View File

@@ -41,6 +41,13 @@
description: "Disable PRs updating GitHub runners (e.g. 'runs-on: macos-14')",
enabled: false,
},
{
// Disable updates of `zip-rs`; intentionally pinned for now due to ownership change
// See: https://github.com/astral-sh/uv/issues/3642
matchPackagePatterns: ["zip"],
matchManagers: ["cargo"],
enabled: false,
},
{
groupName: "pre-commit dependencies",
matchManagers: ["pre-commit"],

View File

@@ -167,6 +167,9 @@ jobs:
- uses: Swatinem/rust-cache@v2
- name: "Run tests"
shell: bash
env:
# Workaround for <https://github.com/nextest-rs/nextest/issues/1493>.
RUSTUP_WINDOWS_PATH_ADD_BIN: 1
run: |
cargo nextest run --all-features --profile ci
cargo test --all-features --doc
@@ -209,6 +212,38 @@ jobs:
- name: "Build"
run: cargo build --release --locked
cargo-build-msrv:
name: "cargo build (msrv)"
runs-on: ubuntu-latest
needs: determine_changes
if: ${{ needs.determine_changes.outputs.code == 'true' || github.ref == 'refs/heads/main' }}
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
- uses: SebRollen/toml-action@v1.2.0
id: msrv
with:
file: "Cargo.toml"
field: "workspace.package.rust-version"
- name: "Install Rust toolchain"
run: rustup default ${{ steps.msrv.outputs.value }}
- name: "Install mold"
uses: rui314/setup-mold@v1
- name: "Install cargo nextest"
uses: taiki-e/install-action@v2
with:
tool: cargo-nextest
- name: "Install cargo insta"
uses: taiki-e/install-action@v2
with:
tool: cargo-insta
- uses: Swatinem/rust-cache@v2
- name: "Run tests"
shell: bash
env:
NEXTEST_PROFILE: "ci"
run: cargo +${{ steps.msrv.outputs.value }} insta test --all-features --unreferenced reject --test-runner nextest
cargo-fuzz:
name: "cargo fuzz"
runs-on: ubuntu-latest
@@ -222,10 +257,13 @@ jobs:
- uses: Swatinem/rust-cache@v2
with:
workspaces: "fuzz -> target"
- name: "Install cargo-fuzz"
uses: taiki-e/install-action@v2
- name: "Install cargo-binstall"
uses: cargo-bins/cargo-binstall@main
with:
tool: cargo-fuzz@0.11.2
- name: "Install cargo-fuzz"
# Download the latest version from quick install and not the github releases because github releases only has MUSL targets.
run: cargo binstall cargo-fuzz --force --disable-strategies crate-meta-data --no-confirm
- run: cargo fuzz build -s none
fuzz-parser:
@@ -303,7 +341,7 @@ jobs:
name: ruff
path: target/debug
- uses: dawidd6/action-download-artifact@v3
- uses: dawidd6/action-download-artifact@v6
name: Download baseline Ruff binary
with:
name: ruff

View File

@@ -47,7 +47,7 @@ jobs:
run: mkdocs build --strict -f mkdocs.public.yml
- name: "Deploy to Cloudflare Pages"
if: ${{ env.CF_API_TOKEN_EXISTS == 'true' }}
uses: cloudflare/wrangler-action@v3.5.0
uses: cloudflare/wrangler-action@v3.6.1
with:
apiToken: ${{ secrets.CF_API_TOKEN }}
accountId: ${{ secrets.CF_ACCOUNT_ID }}

View File

@@ -40,7 +40,7 @@ jobs:
working-directory: playground
- name: "Deploy to Cloudflare Pages"
if: ${{ env.CF_API_TOKEN_EXISTS == 'true' }}
uses: cloudflare/wrangler-action@v3.5.0
uses: cloudflare/wrangler-action@v3.6.1
with:
apiToken: ${{ secrets.CF_API_TOKEN }}
accountId: ${{ secrets.CF_ACCOUNT_ID }}

View File

@@ -17,7 +17,7 @@ jobs:
comment:
runs-on: ubuntu-latest
steps:
- uses: dawidd6/action-download-artifact@v3
- uses: dawidd6/action-download-artifact@v6
name: Download pull request number
with:
name: pr-number
@@ -32,7 +32,7 @@ jobs:
echo "pr-number=$(<pr-number)" >> $GITHUB_OUTPUT
fi
- uses: dawidd6/action-download-artifact@v3
- uses: dawidd6/action-download-artifact@v6
name: "Download ecosystem results"
id: download-ecosystem-result
if: steps.pr-number.outputs.pr-number
@@ -48,6 +48,14 @@ jobs:
id: generate-comment
if: steps.download-ecosystem-result.outputs.found_artifact == 'true'
run: |
# Guard against malicious ecosystem results that symlink to a secret
# file on this runner
if [[ -L pr/ecosystem/ecosystem-result ]]
then
echo "Error: ecosystem-result cannot be a symlink"
exit 1
fi
# Note this identifier is used to find the comment to update on
# subsequent runs
echo '<!-- generated-comment ecosystem -->' >> comment.txt

View File

@@ -163,6 +163,9 @@ jobs:
with:
target: ${{ matrix.platform.target }}
args: --release --locked --out dist
env:
# aarch64 build fails, see https://github.com/PyO3/maturin/issues/2110
XWIN_VERSION: 16
- name: "Test wheel"
if: ${{ !startsWith(matrix.platform.target, 'aarch64') }}
shell: bash
@@ -569,7 +572,7 @@ jobs:
fi
- name: "Build and push Docker image"
uses: docker/build-push-action@v5
uses: docker/build-push-action@v6
with:
context: .
platforms: linux/amd64,linux/arm64

View File

@@ -37,13 +37,13 @@ jobs:
- name: Sync typeshed
id: sync
run: |
rm -rf ruff/crates/red_knot/vendor/typeshed
mkdir ruff/crates/red_knot/vendor/typeshed
cp typeshed/README.md ruff/crates/red_knot/vendor/typeshed
cp typeshed/LICENSE ruff/crates/red_knot/vendor/typeshed
cp -r typeshed/stdlib ruff/crates/red_knot/vendor/typeshed/stdlib
rm -rf ruff/crates/red_knot/vendor/typeshed/stdlib/@tests
git -C typeshed rev-parse HEAD > ruff/crates/red_knot/vendor/typeshed/source_commit.txt
rm -rf ruff/crates/red_knot_module_resolver/vendor/typeshed
mkdir ruff/crates/red_knot_module_resolver/vendor/typeshed
cp typeshed/README.md ruff/crates/red_knot_module_resolver/vendor/typeshed
cp typeshed/LICENSE ruff/crates/red_knot_module_resolver/vendor/typeshed
cp -r typeshed/stdlib ruff/crates/red_knot_module_resolver/vendor/typeshed/stdlib
rm -rf ruff/crates/red_knot_module_resolver/vendor/typeshed/stdlib/@tests
git -C typeshed rev-parse HEAD > ruff/crates/red_knot_module_resolver/vendor/typeshed/source_commit.txt
- name: Commit the changes
id: commit
if: ${{ steps.sync.outcome == 'success' }}

View File

@@ -2,7 +2,7 @@ fail_fast: true
exclude: |
(?x)^(
crates/red_knot/vendor/.*|
crates/red_knot_module_resolver/vendor/.*|
crates/ruff_linter/resources/.*|
crates/ruff_linter/src/rules/.*/snapshots/.*|
crates/ruff/resources/.*|
@@ -14,7 +14,7 @@ exclude: |
repos:
- repo: https://github.com/abravalheri/validate-pyproject
rev: v0.17
rev: v0.18
hooks:
- id: validate-pyproject
@@ -32,7 +32,7 @@ repos:
)$
- repo: https://github.com/igorshubovych/markdownlint-cli
rev: v0.40.0
rev: v0.41.0
hooks:
- id: markdownlint-fix
exclude: |
@@ -42,7 +42,7 @@ repos:
)$
- repo: https://github.com/crate-ci/typos
rev: v1.21.0
rev: v1.22.9
hooks:
- id: typos
@@ -56,7 +56,7 @@ repos:
pass_filenames: false # This makes it a lot faster
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.4.4
rev: v0.4.10
hooks:
- id: ruff-format
- id: ruff

5
.vscode/extensions.json vendored Normal file
View File

@@ -0,0 +1,5 @@
{
"recommendations": [
"rust-lang.rust-analyzer"
]
}

6
.vscode/settings.json vendored Normal file
View File

@@ -0,0 +1,6 @@
{
"rust-analyzer.check.extraArgs": [
"--all-features"
],
"rust-analyzer.check.command": "clippy",
}

View File

@@ -1,5 +1,217 @@
# Changelog
## 0.4.10
### Parser
- Implement re-lexing logic for better error recovery ([#11845](https://github.com/astral-sh/ruff/pull/11845))
### Rule changes
- \[`flake8-copyright`\] Update `CPY001` to check the first 4096 bytes instead of 1024 ([#11927](https://github.com/astral-sh/ruff/pull/11927))
- \[`pycodestyle`\] Update `E999` to show all syntax errors instead of just the first one ([#11900](https://github.com/astral-sh/ruff/pull/11900))
### Server
- Add tracing setup guide to Helix documentation ([#11883](https://github.com/astral-sh/ruff/pull/11883))
- Add tracing setup guide to Neovim documentation ([#11884](https://github.com/astral-sh/ruff/pull/11884))
- Defer notebook cell deletion to avoid an error message ([#11864](https://github.com/astral-sh/ruff/pull/11864))
### Security
- Guard against malicious ecosystem comment artifacts ([#11879](https://github.com/astral-sh/ruff/pull/11879))
## 0.4.9
### Preview features
- \[`pylint`\] Implement `consider-dict-items` (`C0206`) ([#11688](https://github.com/astral-sh/ruff/pull/11688))
- \[`refurb`\] Implement `repeated-global` (`FURB154`) ([#11187](https://github.com/astral-sh/ruff/pull/11187))
### Rule changes
- \[`pycodestyle`\] Adapt fix for `E203` to work identical to `ruff format` ([#10999](https://github.com/astral-sh/ruff/pull/10999))
### Formatter
- Fix formatter instability for lines only consisting of zero-width characters ([#11748](https://github.com/astral-sh/ruff/pull/11748))
### Server
- Add supported commands in server capabilities ([#11850](https://github.com/astral-sh/ruff/pull/11850))
- Use real file path when available in `ruff server` ([#11800](https://github.com/astral-sh/ruff/pull/11800))
- Improve error message when a command is run on an unavailable document ([#11823](https://github.com/astral-sh/ruff/pull/11823))
- Introduce the `ruff.printDebugInformation` command ([#11831](https://github.com/astral-sh/ruff/pull/11831))
- Tracing system now respects log level and trace level, with options to log to a file ([#11747](https://github.com/astral-sh/ruff/pull/11747))
### CLI
- Handle non-printable characters in diff view ([#11687](https://github.com/astral-sh/ruff/pull/11687))
### Bug fixes
- \[`refurb`\] Avoid suggesting starmap when arguments are used outside call (`FURB140`) ([#11830](https://github.com/astral-sh/ruff/pull/11830))
- \[`flake8-bugbear`\] Avoid panic in `B909` when checking large loop blocks ([#11772](https://github.com/astral-sh/ruff/pull/11772))
- \[`refurb`\] Fix misbehavior of `operator.itemgetter` when getter param is a tuple (`FURB118`) ([#11774](https://github.com/astral-sh/ruff/pull/11774))
## 0.4.8
### Performance
- Linter performance has been improved by around 10% on some microbenchmarks by refactoring the lexer and parser to maintain synchronicity between them ([#11457](https://github.com/astral-sh/ruff/pull/11457))
### Preview features
- \[`flake8-bugbear`\] Implement `return-in-generator` (`B901`) ([#11644](https://github.com/astral-sh/ruff/pull/11644))
- \[`flake8-pyi`\] Implement `PYI063` ([#11699](https://github.com/astral-sh/ruff/pull/11699))
- \[`pygrep_hooks`\] Check blanket ignores via file-level pragmas (`PGH004`) ([#11540](https://github.com/astral-sh/ruff/pull/11540))
### Rule changes
- \[`pyupgrade`\] Update `UP035` for Python 3.13 and the latest version of `typing_extensions` ([#11693](https://github.com/astral-sh/ruff/pull/11693))
- \[`numpy`\] Update `NPY001` rule for NumPy 2.0 ([#11735](https://github.com/astral-sh/ruff/pull/11735))
### Server
- Formatting a document with syntax problems no longer spams a visible error popup ([#11745](https://github.com/astral-sh/ruff/pull/11745))
### CLI
- Add RDJson support for `--output-format` flag ([#11682](https://github.com/astral-sh/ruff/pull/11682))
### Bug fixes
- \[`pyupgrade`\] Write empty string in lieu of panic when fixing `UP032` ([#11696](https://github.com/astral-sh/ruff/pull/11696))
- \[`flake8-simplify`\] Simplify double negatives in `SIM103` ([#11684](https://github.com/astral-sh/ruff/pull/11684))
- Ensure the expression generator adds a newline before `type` statements ([#11720](https://github.com/astral-sh/ruff/pull/11720))
- Respect per-file ignores for blanket and redirected noqa rules ([#11728](https://github.com/astral-sh/ruff/pull/11728))
## 0.4.7
### Preview features
- \[`flake8-pyi`\] Implement `PYI064` ([#11325](https://github.com/astral-sh/ruff/pull/11325))
- \[`flake8-pyi`\] Implement `PYI066` ([#11541](https://github.com/astral-sh/ruff/pull/11541))
- \[`flake8-pyi`\] Implement `PYI057` ([#11486](https://github.com/astral-sh/ruff/pull/11486))
- \[`pyflakes`\] Enable `F822` in `__init__.py` files by default ([#11370](https://github.com/astral-sh/ruff/pull/11370))
### Formatter
- Fix incorrect placement of trailing stub function comments ([#11632](https://github.com/astral-sh/ruff/pull/11632))
### Server
- Respect file exclusions in `ruff server` ([#11590](https://github.com/astral-sh/ruff/pull/11590))
- Add support for documents that do not exist on disk ([#11588](https://github.com/astral-sh/ruff/pull/11588))
- Add Vim and Kate setup guide for `ruff server` ([#11615](https://github.com/astral-sh/ruff/pull/11615))
### Bug fixes
- Avoid removing newlines between docstring headers and rST blocks ([#11609](https://github.com/astral-sh/ruff/pull/11609))
- Infer indentation with imports when logical indent is absent ([#11608](https://github.com/astral-sh/ruff/pull/11608))
- Use char index rather than position for indent slice ([#11645](https://github.com/astral-sh/ruff/pull/11645))
- \[`flake8-comprehension`\] Strip parentheses around generators in `C400` ([#11607](https://github.com/astral-sh/ruff/pull/11607))
- Mark `repeated-isinstance-calls` as unsafe on Python 3.10 and later ([#11622](https://github.com/astral-sh/ruff/pull/11622))
## 0.4.6
### Breaking changes
- Use project-relative paths when calculating GitLab fingerprints ([#11532](https://github.com/astral-sh/ruff/pull/11532))
- Bump minimum supported Windows version to Windows 10 ([#11613](https://github.com/astral-sh/ruff/pull/11613))
### Preview features
- \[`flake8-async`\] Sleep with >24 hour interval should usually sleep forever (`ASYNC116`) ([#11498](https://github.com/astral-sh/ruff/pull/11498))
### Rule changes
- \[`numpy`\] Add missing functions to NumPy 2.0 migration rule ([#11528](https://github.com/astral-sh/ruff/pull/11528))
- \[`mccabe`\] Consider irrefutable pattern similar to `if .. else` for `C901` ([#11565](https://github.com/astral-sh/ruff/pull/11565))
- Consider `match`-`case` statements for `C901`, `PLR0912`, and `PLR0915` ([#11521](https://github.com/astral-sh/ruff/pull/11521))
- Remove empty strings when converting to f-string (`UP032`) ([#11524](https://github.com/astral-sh/ruff/pull/11524))
- \[`flake8-bandit`\] `request-without-timeout` should warn for `requests.request` ([#11548](https://github.com/astral-sh/ruff/pull/11548))
- \[`flake8-self`\] Ignore sunder accesses in `flake8-self` rules ([#11546](https://github.com/astral-sh/ruff/pull/11546))
- \[`pyupgrade`\] Lint for `TypeAliasType` usages (`UP040`) ([#11530](https://github.com/astral-sh/ruff/pull/11530))
### Server
- Respect excludes in `ruff server` configuration discovery ([#11551](https://github.com/astral-sh/ruff/pull/11551))
- Use default settings if initialization options are empty or not provided ([#11566](https://github.com/astral-sh/ruff/pull/11566))
- `ruff server` correctly treats `.pyi` files as stub files ([#11535](https://github.com/astral-sh/ruff/pull/11535))
- `ruff server` searches for configuration in parent directories ([#11537](https://github.com/astral-sh/ruff/pull/11537))
- `ruff server`: An empty code action filter no longer returns notebook source actions ([#11526](https://github.com/astral-sh/ruff/pull/11526))
### Bug fixes
- \[`flake8-logging-format`\] Fix autofix title in `logging-warn` (`G010`) ([#11514](https://github.com/astral-sh/ruff/pull/11514))
- \[`refurb`\] Avoid recommending `operator.itemgetter` with dependence on lambda arguments ([#11574](https://github.com/astral-sh/ruff/pull/11574))
- \[`flake8-simplify`\] Avoid recommending context manager in `__enter__` implementations ([#11575](https://github.com/astral-sh/ruff/pull/11575))
- Create intermediary directories for `--output-file` ([#11550](https://github.com/astral-sh/ruff/pull/11550))
- Propagate reads on global variables ([#11584](https://github.com/astral-sh/ruff/pull/11584))
- Treat all `singledispatch` arguments as runtime-required ([#11523](https://github.com/astral-sh/ruff/pull/11523))
## 0.4.5
### Ruff's language server is now in Beta
`v0.4.5` marks the official Beta release of `ruff server`, an integrated language server built into Ruff.
`ruff server` supports the same feature set as `ruff-lsp`, powering linting, formatting, and
code fixes in Ruff's editor integrations -- but with superior performance and
no installation required. We'd love your feedback!
You can enable `ruff server` in the [VS Code extension](https://github.com/astral-sh/ruff-vscode?tab=readme-ov-file#enabling-the-rust-based-language-server) today.
To read more about this exciting milestone, check out our [blog post](https://astral.sh/blog/ruff-v0.4.5)!
### Rule changes
- \[`flake8-future-annotations`\] Reword `future-rewritable-type-annotation` (`FA100`) message ([#11381](https://github.com/astral-sh/ruff/pull/11381))
- \[`isort`\] Expanded the set of standard-library modules to include `_string`, etc. ([#11374](https://github.com/astral-sh/ruff/pull/11374))
- \[`pycodestyle`\] Consider soft keywords for `E27` rules ([#11446](https://github.com/astral-sh/ruff/pull/11446))
- \[`pyflakes`\] Recommend adding unused import bindings to `__all__` ([#11314](https://github.com/astral-sh/ruff/pull/11314))
- \[`pyflakes`\] Update documentation and deprecate `ignore_init_module_imports` ([#11436](https://github.com/astral-sh/ruff/pull/11436))
- \[`pyupgrade`\] Mark quotes as unnecessary for non-evaluated annotations ([#11485](https://github.com/astral-sh/ruff/pull/11485))
### Formatter
- Avoid multiline quotes warning with `quote-style = preserve` ([#11490](https://github.com/astral-sh/ruff/pull/11490))
### Server
- Support Jupyter Notebook files ([#11206](https://github.com/astral-sh/ruff/pull/11206))
- Support `noqa` comment code actions ([#11276](https://github.com/astral-sh/ruff/pull/11276))
- Fix automatic configuration reloading ([#11492](https://github.com/astral-sh/ruff/pull/11492))
- Fix several issues with configuration in Neovim and Helix ([#11497](https://github.com/astral-sh/ruff/pull/11497))
### CLI
- Add `--output-format` as a CLI option for `ruff config` ([#11438](https://github.com/astral-sh/ruff/pull/11438))
### Bug fixes
- Avoid `PLE0237` for property with setter ([#11377](https://github.com/astral-sh/ruff/pull/11377))
- Avoid `TCH005` for `if` stmt with `elif`/`else` block ([#11376](https://github.com/astral-sh/ruff/pull/11376))
- Avoid flagging `__future__` annotations as required for non-evaluated type annotations ([#11414](https://github.com/astral-sh/ruff/pull/11414))
- Check for ruff executable in 'bin' directory as installed by 'pip install --target'. ([#11450](https://github.com/astral-sh/ruff/pull/11450))
- Sort edits prior to deduplicating in quotation fix ([#11452](https://github.com/astral-sh/ruff/pull/11452))
- Treat escaped newline as valid sequence ([#11465](https://github.com/astral-sh/ruff/pull/11465))
- \[`flake8-pie`\] Preserve parentheses in `unnecessary-dict-kwargs` ([#11372](https://github.com/astral-sh/ruff/pull/11372))
- \[`pylint`\] Ignore `__slots__` with dynamic values ([#11488](https://github.com/astral-sh/ruff/pull/11488))
- \[`pylint`\] Remove `try` body from branch counting ([#11487](https://github.com/astral-sh/ruff/pull/11487))
- \[`refurb`\] Respect operator precedence in `FURB110` ([#11464](https://github.com/astral-sh/ruff/pull/11464))
### Documentation
- Add `--preview` to the README ([#11395](https://github.com/astral-sh/ruff/pull/11395))
- Add Python 3.13 to list of allowed Python versions ([#11411](https://github.com/astral-sh/ruff/pull/11411))
- Simplify Neovim setup documentation ([#11489](https://github.com/astral-sh/ruff/pull/11489))
- Update CONTRIBUTING.md to reflect the new parser ([#11434](https://github.com/astral-sh/ruff/pull/11434))
- Update server documentation with new migration guide ([#11499](https://github.com/astral-sh/ruff/pull/11499))
- \[`pycodestyle`\] Clarify motivation for `E713` and `E714` ([#11483](https://github.com/astral-sh/ruff/pull/11483))
- \[`pyflakes`\] Update docs to describe WAI behavior (F541) ([#11362](https://github.com/astral-sh/ruff/pull/11362))
- \[`pylint`\] Clearly indicate what is counted as a branch ([#11423](https://github.com/astral-sh/ruff/pull/11423))
## 0.4.4
### Preview features
@@ -74,6 +286,10 @@
- Avoid allocations for isort module names ([#11251](https://github.com/astral-sh/ruff/pull/11251))
- Build a separate ARM wheel for macOS ([#11149](https://github.com/astral-sh/ruff/pull/11149))
### Windows
- Increase the minimum requirement to Windows 10.
## 0.4.2
### Rule changes

View File

@@ -101,6 +101,8 @@ pre-commit run --all-files --show-diff-on-failure # Rust and Python formatting,
These checks will run on GitHub Actions when you open your pull request, but running them locally
will save you time and expedite the merge process.
If you're using VS Code, you can also install the recommended [rust-analyzer](https://marketplace.visualstudio.com/items?itemName=rust-lang.rust-analyzer) extension to get these checks while editing.
Note that many code changes also require updating the snapshot tests, which is done interactively
after running `cargo test` like so:
@@ -349,7 +351,9 @@ even patch releases may contain [non-backwards-compatible changes](https://semve
- The commit hash of the merged release pull request on `main`
1. The release workflow will do the following:
1. Build all the assets. If this fails (even though we tested in step 4), we haven't tagged or
uploaded anything, you can restart after pushing a fix.
uploaded anything, you can restart after pushing a fix. If you just need to rerun the build,
make sure you're [re-running all the failed
jobs](https://docs.github.com/en/actions/managing-workflow-runs/re-running-workflows-and-jobs#re-running-failed-jobs-in-a-workflow) and not just a single failed job.
1. Upload to PyPI.
1. Create and push the Git tag (as extracted from `pyproject.toml`). We create the Git tag only
after building the wheels and uploading to PyPI, since we can't delete or modify the tag ([#4468](https://github.com/astral-sh/ruff/issues/4468)).

497
Cargo.lock generated

File diff suppressed because it is too large

View File

@@ -4,7 +4,7 @@ resolver = "2"
[workspace.package]
edition = "2021"
rust-version = "1.71"
rust-version = "1.75"
homepage = "https://docs.astral.sh/ruff"
documentation = "https://docs.astral.sh/ruff"
repository = "https://github.com/astral-sh/ruff"
@@ -14,6 +14,7 @@ license = "MIT"
[workspace.dependencies]
ruff = { path = "crates/ruff" }
ruff_cache = { path = "crates/ruff_cache" }
ruff_db = { path = "crates/ruff_db" }
ruff_diagnostics = { path = "crates/ruff_diagnostics" }
ruff_formatter = { path = "crates/ruff_formatter" }
ruff_index = { path = "crates/ruff_index" }
@@ -34,6 +35,8 @@ ruff_source_file = { path = "crates/ruff_source_file" }
ruff_text_size = { path = "crates/ruff_text_size" }
ruff_workspace = { path = "crates/ruff_workspace" }
red_knot_module_resolver = { path = "crates/red_knot_module_resolver" }
aho-corasick = { version = "1.1.3" }
annotate-snippets = { version = "0.9.2", features = ["color"] }
anyhow = { version = "1.0.80" }
@@ -42,6 +45,7 @@ bincode = { version = "1.3.3" }
bitflags = { version = "2.5.0" }
bstr = { version = "1.9.1" }
cachedir = { version = "0.3.1" }
camino = { version = "1.1.7" }
chrono = { version = "0.4.35", default-features = false, features = ["clock"] }
clap = { version = "4.5.3", features = ["derive"] }
clap_complete_command = { version = "0.5.1" }
@@ -62,26 +66,26 @@ filetime = { version = "0.2.23" }
glob = { version = "0.3.1" }
globset = { version = "0.4.14" }
hashbrown = "0.14.3"
hexf-parse = { version = "0.2.1" }
ignore = { version = "0.4.22" }
imara-diff = { version = "0.1.5" }
imperative = { version = "1.0.4" }
indexmap = { version = "2.2.6" }
indicatif = { version = "0.17.8" }
indoc = { version = "2.0.4" }
insta = { version = "1.35.1", feature = ["filters", "glob"] }
insta = { version = "1.35.1" }
insta-cmd = { version = "0.6.0" }
is-macro = { version = "0.3.5" }
is-wsl = { version = "0.4.0" }
itertools = { version = "0.12.1" }
itertools = { version = "0.13.0" }
js-sys = { version = "0.3.69" }
jod-thread = { version = "0.1.2" }
lexical-parse-float = { version = "0.8.0", features = ["format"] }
libc = { version = "0.2.153" }
libcst = { version = "1.1.0", default-features = false }
log = { version = "0.4.17" }
lsp-server = { version = "0.7.6" }
lsp-types = { version = "0.95.0", features = ["proposed"] }
lsp-types = { git = "https://github.com/astral-sh/lsp-types.git", rev = "3512a9f", features = [
"proposed",
] }
matchit = { version = "0.8.1" }
memchr = { version = "2.7.1" }
mimalloc = { version = "0.1.39" }
@@ -101,18 +105,21 @@ quote = { version = "1.0.23" }
rand = { version = "0.8.5" }
rayon = { version = "1.10.0" }
regex = { version = "1.10.2" }
result-like = { version = "0.5.0" }
rustc-hash = { version = "1.1.0" }
rustc-hash = { version = "2.0.0" }
salsa = { git = "https://github.com/salsa-rs/salsa.git", rev = "f706aa2d32d473ee633a77c1af01d180c85da308" }
schemars = { version = "0.8.16" }
seahash = { version = "4.1.0" }
serde = { version = "1.0.197", features = ["derive"] }
serde-wasm-bindgen = { version = "0.6.4" }
serde_json = { version = "1.0.113" }
serde_test = { version = "1.0.152" }
serde_with = { version = "3.6.0", default-features = false, features = ["macros"] }
serde_with = { version = "3.6.0", default-features = false, features = [
"macros",
] }
shellexpand = { version = "3.0.0" }
similar = { version = "2.4.0", features = ["inline"] }
smallvec = { version = "1.13.2" }
smol_str = { version = "0.2.2" }
static_assertions = "1.1.0"
strum = { version = "0.26.0", features = ["strum_macros"] }
strum_macros = { version = "0.26.0" }
@@ -134,11 +141,17 @@ unicode_names2 = { version = "1.2.2" }
unicode-normalization = { version = "0.1.23" }
ureq = { version = "2.9.6" }
url = { version = "2.5.0" }
uuid = { version = "1.6.1", features = ["v4", "fast-rng", "macro-diagnostics", "js"] }
uuid = { version = "1.6.1", features = [
"v4",
"fast-rng",
"macro-diagnostics",
"js",
] }
walkdir = { version = "2.3.2" }
wasm-bindgen = { version = "0.2.92" }
wasm-bindgen-test = { version = "0.3.42" }
wild = { version = "2" }
zip = { version = "0.6.6", default-features = false, features = ["zstd"] }
[workspace.lints.rust]
unsafe_code = "warn"

View File

@@ -28,7 +28,7 @@ An extremely fast Python linter and code formatter, written in Rust.
- ⚡️ 10-100x faster than existing linters (like Flake8) and formatters (like Black)
- 🐍 Installable via `pip`
- 🛠️ `pyproject.toml` support
- 🤝 Python 3.12 compatibility
- 🤝 Python 3.13 compatibility
- ⚖️ Drop-in parity with [Flake8](https://docs.astral.sh/ruff/faq/#how-does-ruff-compare-to-flake8), isort, and Black
- 📦 Built-in caching, to avoid re-analyzing unchanged files
- 🔧 Fix support, for automatic error correction (e.g., automatically remove unused imports)
@@ -152,7 +152,7 @@ Ruff can also be used as a [pre-commit](https://pre-commit.com/) hook via [`ruff
```yaml
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.4.4
rev: v0.4.10
hooks:
# Run the linter.
- id: ruff
@@ -408,6 +408,7 @@ Ruff is used by a number of major open-source projects and companies, including:
- [Dagster](https://github.com/dagster-io/dagster)
- Databricks ([MLflow](https://github.com/mlflow/mlflow))
- [FastAPI](https://github.com/tiangolo/fastapi)
- [Godot](https://github.com/godotengine/godot)
- [Gradio](https://github.com/gradio-app/gradio)
- [Great Expectations](https://github.com/great-expectations/great_expectations)
- [HTTPX](https://github.com/encode/httpx)
@@ -433,6 +434,7 @@ Ruff is used by a number of major open-source projects and companies, including:
- Modern Treasury ([Python SDK](https://github.com/Modern-Treasury/modern-treasury-python))
- Mozilla ([Firefox](https://github.com/mozilla/gecko-dev))
- [Mypy](https://github.com/python/mypy)
- [Nautobot](https://github.com/nautobot/nautobot)
- Netflix ([Dispatch](https://github.com/Netflix/dispatch))
- [Neon](https://github.com/neondatabase/neon)
- [Nokia](https://nokia.com/)
@@ -440,6 +442,7 @@ Ruff is used by a number of major open-source projects and companies, including:
- [NumPyro](https://github.com/pyro-ppl/numpyro)
- [ONNX](https://github.com/onnx/onnx)
- [OpenBB](https://github.com/OpenBB-finance/OpenBBTerminal)
- [Open Wine Components](https://github.com/Open-Wine-Components/umu-launcher)
- [PDM](https://github.com/pdm-project/pdm)
- [PaddlePaddle](https://github.com/PaddlePaddle/Paddle)
- [Pandas](https://github.com/pandas-dev/pandas)
@@ -467,6 +470,7 @@ Ruff is used by a number of major open-source projects and companies, including:
- [Sphinx](https://github.com/sphinx-doc/sphinx)
- [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3)
- [Starlette](https://github.com/encode/starlette)
- [Streamlit](https://github.com/streamlit/streamlit)
- [The Algorithms](https://github.com/TheAlgorithms/Python)
- [Vega-Altair](https://github.com/altair-viz/altair)
- WordPress ([Openverse](https://github.com/WordPress/openverse))

View File

@@ -1,6 +1,6 @@
[files]
# https://github.com/crate-ci/typos/issues/868
extend-exclude = ["crates/red_knot/vendor/**/*", "**/resources/**/*", "**/snapshots/**/*"]
extend-exclude = ["crates/red_knot_module_resolver/vendor/**/*", "**/resources/**/*", "**/snapshots/**/*"]
[default.extend-words]
"arange" = "arange" # e.g. `numpy.arange`

View File

@@ -12,6 +12,8 @@ license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
red_knot_module_resolver = { workspace = true }
ruff_python_parser = { workspace = true }
ruff_python_ast = { workspace = true }
ruff_text_size = { workspace = true }
@@ -25,6 +27,7 @@ ctrlc = { version = "3.4.4" }
dashmap = { workspace = true }
hashbrown = { workspace = true }
indexmap = { workspace = true }
is-macro = { workspace = true }
notify = { workspace = true }
parking_lot = { workspace = true }
rayon = { workspace = true }
@@ -35,7 +38,6 @@ tracing-subscriber = { workspace = true }
tracing-tree = { workspace = true }
[dev-dependencies]
textwrap = { version = "0.16.1" }
tempfile = { workspace = true }
[lints]

View File

@@ -1,9 +0,0 @@
# Red Knot
The Red Knot crate contains code working towards multifile analysis, type inference and, ultimately, type-checking. It's very much a work in progress for now.
## Vendored types for the stdlib
Red Knot vendors [typeshed](https://github.com/python/typeshed)'s stubs for the standard library. The vendored stubs can be found in `crates/red_knot/vendor/typeshed`. The file `crates/red_knot/vendor/typeshed/source_commit.txt` tells you the typeshed commit that our vendored stdlib stubs currently correspond to.
The typeshed stubs are updated every two weeks via an automated PR using the `sync_typeshed.yaml` workflow in the `.github/workflows` directory. This workflow can also be triggered at any time via [workflow dispatch](https://docs.github.com/en/actions/using-workflows/manually-running-a-workflow#running-a-workflow).

View File

@@ -6,8 +6,8 @@ use std::marker::PhantomData;
use rustc_hash::FxHashMap;
use ruff_index::{Idx, IndexVec};
use ruff_python_ast::visitor::preorder;
use ruff_python_ast::visitor::preorder::{PreorderVisitor, TraversalSignal};
use ruff_python_ast::visitor::source_order;
use ruff_python_ast::visitor::source_order::{SourceOrderVisitor, TraversalSignal};
use ruff_python_ast::{
AnyNodeRef, AstNode, ExceptHandler, ExceptHandlerExceptHandler, Expr, MatchCase, ModModule,
NodeKind, Parameter, Stmt, StmtAnnAssign, StmtAssign, StmtAugAssign, StmtClassDef,
@@ -91,9 +91,9 @@ impl AstIds {
while let Some(deferred) = visitor.deferred.pop() {
match deferred {
DeferredNode::FunctionDefinition(def) => {
def.visit_preorder(&mut visitor);
def.visit_source_order(&mut visitor);
}
DeferredNode::ClassDefinition(def) => def.visit_preorder(&mut visitor),
DeferredNode::ClassDefinition(def) => def.visit_source_order(&mut visitor),
}
}
@@ -182,7 +182,7 @@ impl<'a> AstIdsVisitor<'a> {
}
}
impl<'a> PreorderVisitor<'a> for AstIdsVisitor<'a> {
impl<'a> SourceOrderVisitor<'a> for AstIdsVisitor<'a> {
fn visit_stmt(&mut self, stmt: &'a Stmt) {
match stmt {
Stmt::FunctionDef(def) => {
@@ -226,14 +226,14 @@ impl<'a> PreorderVisitor<'a> for AstIdsVisitor<'a> {
Stmt::IpyEscapeCommand(_) => {}
}
preorder::walk_stmt(self, stmt);
source_order::walk_stmt(self, stmt);
}
fn visit_expr(&mut self, _expr: &'a Expr) {}
fn visit_parameter(&mut self, parameter: &'a Parameter) {
self.create_id(parameter);
preorder::walk_parameter(self, parameter);
source_order::walk_parameter(self, parameter);
}
fn visit_except_handler(&mut self, except_handler: &'a ExceptHandler) {
@@ -243,17 +243,17 @@ impl<'a> PreorderVisitor<'a> for AstIdsVisitor<'a> {
}
}
preorder::walk_except_handler(self, except_handler);
source_order::walk_except_handler(self, except_handler);
}
fn visit_with_item(&mut self, with_item: &'a WithItem) {
self.create_id(with_item);
preorder::walk_with_item(self, with_item);
source_order::walk_with_item(self, with_item);
}
fn visit_match_case(&mut self, match_case: &'a MatchCase) {
self.create_id(match_case);
preorder::walk_match_case(self, match_case);
source_order::walk_match_case(self, match_case);
}
fn visit_type_param(&mut self, type_param: &'a TypeParam) {
@@ -275,10 +275,7 @@ pub struct TypedNodeKey<N: AstNode> {
impl<N: AstNode> TypedNodeKey<N> {
pub fn from_node(node: &N) -> Self {
let inner = NodeKey {
kind: node.as_any_node_ref().kind(),
range: node.range(),
};
let inner = NodeKey::from_node(node.as_any_node_ref());
Self {
inner,
_marker: PhantomData,
@@ -312,7 +309,7 @@ struct FindNodeKeyVisitor<'a> {
result: Option<AnyNodeRef<'a>>,
}
impl<'a> PreorderVisitor<'a> for FindNodeKeyVisitor<'a> {
impl<'a> SourceOrderVisitor<'a> for FindNodeKeyVisitor<'a> {
fn enter_node(&mut self, node: AnyNodeRef<'a>) -> TraversalSignal {
if self.result.is_some() {
return TraversalSignal::Skip;
@@ -352,6 +349,12 @@ pub struct NodeKey {
}
impl NodeKey {
pub fn from_node(node: AnyNodeRef) -> Self {
NodeKey {
kind: node.kind(),
range: node.range(),
}
}
pub fn resolve<'a>(&self, root: AnyNodeRef<'a>) -> Option<AnyNodeRef<'a>> {
// We need to do a binary search here. Only traverse into a node if the range is within the node
let mut visitor = FindNodeKeyVisitor {

View File

@@ -9,9 +9,9 @@ use crate::files::FileId;
use crate::lint::{LintSemanticStorage, LintSyntaxStorage};
use crate::module::ModuleResolver;
use crate::parse::ParsedStorage;
use crate::semantic::SemanticIndexStorage;
use crate::semantic::TypeStore;
use crate::source::SourceStorage;
use crate::symbols::SymbolTablesStorage;
use crate::types::TypeStore;
mod jars;
mod query;
@@ -125,7 +125,7 @@ pub struct SourceJar {
#[derive(Debug, Default)]
pub struct SemanticJar {
pub module_resolver: ModuleResolver,
pub symbol_tables: SymbolTablesStorage,
pub semantic_indices: SemanticIndexStorage,
pub type_store: TypeStore,
}

View File

@@ -17,9 +17,8 @@ pub mod lint;
pub mod module;
mod parse;
pub mod program;
mod semantic;
pub mod source;
mod symbols;
mod types;
pub mod watch;
pub(crate) type FxDashMap<K, V> = dashmap::DashMap<K, V, BuildHasherDefault<FxHasher>>;

View File

@@ -5,17 +5,18 @@ use std::time::Duration;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::{ModModule, StringLiteral};
use ruff_python_parser::Parsed;
use crate::cache::KeyValueCache;
use crate::db::{LintDb, LintJar, QueryResult};
use crate::files::FileId;
use crate::module::ModuleName;
use crate::parse::{parse, Parsed};
use crate::source::{source_text, Source};
use crate::symbols::{
resolve_global_symbol, symbol_table, Definition, GlobalSymbolId, SymbolId, SymbolTable,
use crate::module::{resolve_module, ModuleName};
use crate::parse::parse;
use crate::semantic::{infer_definition_type, infer_symbol_public_type, Type};
use crate::semantic::{
resolve_global_symbol, semantic_index, Definition, GlobalSymbolId, SemanticIndex, SymbolId,
};
use crate::types::{infer_definition_type, infer_symbol_type, Type};
use crate::source::{source_text, Source};
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn lint_syntax(db: &dyn LintDb, file_id: FileId) -> QueryResult<Diagnostics> {
@@ -40,7 +41,7 @@ pub(crate) fn lint_syntax(db: &dyn LintDb, file_id: FileId) -> QueryResult<Diagn
let parsed = parse(db.upcast(), *file_id)?;
if parsed.errors().is_empty() {
let ast = parsed.ast();
let ast = parsed.syntax();
let mut visitor = SyntaxLintVisitor {
diagnostics,
@@ -81,13 +82,13 @@ pub(crate) fn lint_semantic(db: &dyn LintDb, file_id: FileId) -> QueryResult<Dia
storage.get(&file_id, |file_id| {
let source = source_text(db.upcast(), *file_id)?;
let parsed = parse(db.upcast(), *file_id)?;
let symbols = symbol_table(db.upcast(), *file_id)?;
let semantic_index = semantic_index(db.upcast(), *file_id)?;
let context = SemanticLintContext {
file_id: *file_id,
source,
parsed,
symbols,
parsed: &parsed,
semantic_index,
db,
diagnostics: RefCell::new(Vec::new()),
};
@@ -101,17 +102,17 @@ pub(crate) fn lint_semantic(db: &dyn LintDb, file_id: FileId) -> QueryResult<Dia
fn lint_unresolved_imports(context: &SemanticLintContext) -> QueryResult<()> {
// TODO: Consider iterating over the dependencies (imports) only instead of all definitions.
for (symbol, definition) in context.symbols().all_definitions() {
for (symbol, definition) in context.semantic_index().symbol_table().all_definitions() {
match definition {
Definition::Import(import) => {
let ty = context.infer_symbol_type(symbol)?;
let ty = context.infer_symbol_public_type(symbol)?;
if ty.is_unknown() {
context.push_diagnostic(format!("Unresolved module {}", import.module));
}
}
Definition::ImportFrom(import) => {
let ty = context.infer_symbol_type(symbol)?;
let ty = context.infer_symbol_public_type(symbol)?;
if ty.is_unknown() {
let module_name = import.module().map(Deref::deref).unwrap_or_default();
@@ -144,16 +145,14 @@ fn lint_bad_overrides(context: &SemanticLintContext) -> QueryResult<()> {
// TODO we should have a special marker on the real typing module (from typeshed) so if you
// have your own "typing" module in your project, we don't consider it THE typing module (and
// same for other stdlib modules that our lint rules care about)
let Some(typing_override) =
resolve_global_symbol(context.db.upcast(), ModuleName::new("typing"), "override")?
else {
let Some(typing_override) = context.resolve_global_symbol("typing", "override")? else {
// TODO once we bundle typeshed, this should be unreachable!()
return Ok(());
};
// TODO we should maybe index definitions by type instead of iterating all, or else iterate all
// just once, match, and branch to all lint rules that care about a type of definition
for (symbol, definition) in context.symbols().all_definitions() {
for (symbol, definition) in context.semantic_index().symbol_table().all_definitions() {
if !matches!(definition, Definition::FunctionDef(_)) {
continue;
}
@@ -194,8 +193,8 @@ fn lint_bad_overrides(context: &SemanticLintContext) -> QueryResult<()> {
pub struct SemanticLintContext<'a> {
file_id: FileId,
source: Source,
parsed: Parsed,
symbols: Arc<SymbolTable>,
parsed: &'a Parsed<ModModule>,
semantic_index: Arc<SemanticIndex>,
db: &'a dyn LintDb,
diagnostics: RefCell<Vec<String>>,
}
@@ -209,16 +208,16 @@ impl<'a> SemanticLintContext<'a> {
self.file_id
}
pub fn ast(&self) -> &ModModule {
self.parsed.ast()
pub fn ast(&self) -> &'a ModModule {
self.parsed.syntax()
}
pub fn symbols(&self) -> &SymbolTable {
&self.symbols
pub fn semantic_index(&self) -> &SemanticIndex {
&self.semantic_index
}
pub fn infer_symbol_type(&self, symbol_id: SymbolId) -> QueryResult<Type> {
infer_symbol_type(
pub fn infer_symbol_public_type(&self, symbol_id: SymbolId) -> QueryResult<Type> {
infer_symbol_public_type(
self.db.upcast(),
GlobalSymbolId {
file_id: self.file_id,
@@ -234,6 +233,18 @@ impl<'a> SemanticLintContext<'a> {
pub fn extend_diagnostics(&mut self, diagnostics: impl IntoIterator<Item = String>) {
self.diagnostics.get_mut().extend(diagnostics);
}
pub fn resolve_global_symbol(
&self,
module: &str,
symbol_name: &str,
) -> QueryResult<Option<GlobalSymbolId>> {
let Some(module) = resolve_module(self.db.upcast(), ModuleName::new(module))? else {
return Ok(None);
};
resolve_global_symbol(self.db.upcast(), module, symbol_name)
}
}
#[derive(Debug)]

View File

@@ -12,7 +12,7 @@ use tracing_subscriber::{Layer, Registry};
use tracing_tree::time::Uptime;
use red_knot::db::{HasJar, ParallelDatabase, QueryError, SourceDb, SourceJar};
use red_knot::module::{set_module_search_paths, ModuleSearchPath, ModuleSearchPathKind};
use red_knot::module::{set_module_search_paths, ModuleResolutionInputs};
use red_knot::program::check::ExecutionMode;
use red_knot::program::{FileWatcherChange, Program};
use red_knot::watch::FileWatcher;
@@ -44,12 +44,17 @@ fn main() -> anyhow::Result<()> {
let workspace_folder = entry_point.parent().unwrap();
let workspace = Workspace::new(workspace_folder.to_path_buf());
let workspace_search_path = ModuleSearchPath::new(
workspace.root().to_path_buf(),
ModuleSearchPathKind::FirstParty,
);
let workspace_search_path = workspace.root().to_path_buf();
let search_paths = ModuleResolutionInputs {
extra_paths: vec![],
workspace_root: workspace_search_path,
site_packages: None,
custom_typeshed: None,
};
let mut program = Program::new(workspace);
set_module_search_paths(&mut program, vec![workspace_search_path]);
set_module_search_paths(&mut program, search_paths);
let entry_id = program.file_id(entry_point);
program.workspace_mut().open_file(entry_id);

View File

@@ -7,16 +7,22 @@ use std::sync::Arc;
use dashmap::mapref::entry::Entry;
use smol_str::SmolStr;
use red_knot_module_resolver::ModuleKind;
use crate::db::{QueryResult, SemanticDb, SemanticJar};
use crate::files::FileId;
use crate::symbols::Dependency;
use crate::semantic::Dependency;
use crate::FxDashMap;
/// ID uniquely identifying a module.
/// Representation of a Python module.
///
/// The inner type wrapped by this struct is a unique identifier for the module
/// that is used by the struct's methods to lazily query information about the module.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct Module(u32);
impl Module {
/// Return the absolute name of the module (e.g. `foo.bar`)
pub fn name(&self, db: &dyn SemanticDb) -> QueryResult<ModuleName> {
let jar: &SemanticJar = db.jar()?;
let modules = &jar.module_resolver;
@@ -24,6 +30,7 @@ impl Module {
Ok(modules.modules.get(self).unwrap().name.clone())
}
/// Return the path to the source code that defines this module
pub fn path(&self, db: &dyn SemanticDb) -> QueryResult<ModulePath> {
let jar: &SemanticJar = db.jar()?;
let modules = &jar.module_resolver;
@@ -31,6 +38,7 @@ impl Module {
Ok(modules.modules.get(self).unwrap().path.clone())
}
/// Determine whether this module is a single-file module or a package
pub fn kind(&self, db: &dyn SemanticDb) -> QueryResult<ModuleKind> {
let jar: &SemanticJar = db.jar()?;
let modules = &jar.module_resolver;
@@ -38,6 +46,16 @@ impl Module {
Ok(modules.modules.get(self).unwrap().kind)
}
/// Attempt to resolve a dependency of this module to an absolute [`ModuleName`].
///
/// A dependency could be either absolute (e.g. the `foo` dependency implied by `from foo import bar`)
/// or relative to this module (e.g. the `.foo` dependency implied by `from .foo import bar`)
///
/// - Returns an error if the query failed.
/// - Returns `Ok(None)` if the query succeeded,
/// but the dependency refers to a module that does not exist.
/// - Returns `Ok(Some(ModuleName))` if the query succeeded,
/// and the dependency refers to a module that exists.
pub fn resolve_dependency(
&self,
db: &dyn SemanticDb,
@@ -87,7 +105,8 @@ impl Module {
/// A module name, e.g. `foo.bar`.
///
/// Always normalized to the absolute form (never a relative module name).
/// Always normalized to the absolute form
/// (never a relative module name, i.e., never `.foo`).
#[derive(Clone, Debug, Eq, PartialEq, Hash)]
pub struct ModuleName(smol_str::SmolStr);
@@ -124,10 +143,13 @@ impl ModuleName {
Some(Self(name))
}
/// An iterator over the components of the module name:
/// `foo.bar.baz` -> `foo`, `bar`, `baz`
pub fn components(&self) -> impl DoubleEndedIterator<Item = &str> {
self.0.split('.')
}
/// The name of this module's immediate parent, if it has a parent
pub fn parent(&self) -> Option<ModuleName> {
let (_, parent) = self.0.rsplit_once('.')?;
@@ -157,14 +179,6 @@ impl std::fmt::Display for ModuleName {
}
}
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub enum ModuleKind {
Module,
/// A python package (a `__init__.py` or `__init__.pyi` file)
Package,
}
/// A search path in which to search modules.
/// Corresponds to a path in [`sys.path`](https://docs.python.org/3/library/sys_path_init.html) at runtime.
///
@@ -181,10 +195,12 @@ impl ModuleSearchPath {
}
}
/// Determine whether this is a first-party, third-party or standard-library search path
pub fn kind(&self) -> ModuleSearchPathKind {
self.inner.kind
}
/// Return the location of the search path on the file system
pub fn path(&self) -> &Path {
&self.inner.path
}
@@ -202,22 +218,31 @@ struct ModuleSearchPathInner {
kind: ModuleSearchPathKind,
}
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
/// Enumeration of the different kinds of search paths type checkers are expected to support.
///
/// N.B. Although we don't implement `Ord` for this enum, they are ordered in terms of the
/// priority that we want to give these modules when resolving them.
/// This is roughly [the order given in the typing spec], but typeshed's stubs
/// for the standard library are moved higher up to match Python's semantics at runtime.
///
/// [the order given in the typing spec]: https://typing.readthedocs.io/en/latest/spec/distributing.html#import-resolution-ordering
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash, is_macro::Is)]
pub enum ModuleSearchPathKind {
// Project dependency
/// "Extra" paths provided by the user in a config file, env var or CLI flag.
/// E.g. mypy's `MYPYPATH` env var, or pyright's `stubPath` configuration setting
Extra,
/// Files in the project we're directly being invoked on
FirstParty,
// e.g. site packages
ThirdParty,
// e.g. built-in modules, typeshed
/// The `stdlib` directory of typeshed (either vendored or custom)
StandardLibrary,
}
impl ModuleSearchPathKind {
pub const fn is_first_party(self) -> bool {
matches!(self, Self::FirstParty)
}
/// Stubs or runtime modules installed in site-packages
SitePackagesThirdParty,
/// Vendored third-party stubs from typeshed
VendoredThirdParty,
}
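Because the variants are declared in priority order, resolution can take the first search path that yields a hit. A minimal sketch of that idea (hypothetical helper, not part of this diff; it ignores packages and `.pyi` stubs):

```rust
use std::path::PathBuf;

fn first_hit<'a>(
    search_paths: &'a [ModuleSearchPath],
    name: &ModuleName,
) -> Option<&'a ModuleSearchPath> {
    search_paths.iter().find(|search_path| {
        // Build `<root>/foo/bar.py` from the module name `foo.bar`.
        let mut candidate: PathBuf = search_path.path().to_path_buf();
        for component in name.components() {
            candidate.push(component);
        }
        candidate.with_extension("py").is_file()
    })
}
```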
#[derive(Debug, Eq, PartialEq)]
@@ -231,9 +256,11 @@ pub struct ModuleData {
// Queries
//////////////////////////////////////////////////////
/// Resolves a module name to a module id
/// TODO: This would not work with Salsa because `ModuleName` isn't an ingredient and, therefore, cannot be used as part of a query.
/// For this to work with salsa, it would be necessary to intern all `ModuleName`s.
/// Resolves a module name to a module.
///
/// TODO: This would not work with Salsa because `ModuleName` isn't an ingredient
/// and, therefore, cannot be used as part of a query.
/// For this to work with Salsa, it would be necessary to intern all `ModuleName`s.
#[tracing::instrument(level = "debug", skip(db))]
pub fn resolve_module(db: &dyn SemanticDb, name: ModuleName) -> QueryResult<Option<Module>> {
let jar: &SemanticJar = db.jar()?;
@@ -255,7 +282,7 @@ pub fn resolve_module(db: &dyn SemanticDb, name: ModuleName) -> QueryResult<Opti
let file_id = db.file_id(&normalized);
let path = ModulePath::new(root_path.clone(), file_id);
let id = Module(
let module = Module(
modules
.next_module_id
.fetch_add(1, std::sync::atomic::Ordering::Relaxed),
@@ -263,7 +290,7 @@ pub fn resolve_module(db: &dyn SemanticDb, name: ModuleName) -> QueryResult<Opti
modules
.modules
.insert(id, Arc::from(ModuleData { name, path, kind }));
.insert(module, Arc::from(ModuleData { name, path, kind }));
// A path can map to multiple modules because of symlinks:
// ```
@@ -272,33 +299,33 @@ pub fn resolve_module(db: &dyn SemanticDb, name: ModuleName) -> QueryResult<Opti
// ```
// Here, both `foo` and `bar` resolve to the same module but through different paths.
// That's why we need to insert the absolute path and not the normalized path here.
let absolute_id = if absolute_path == normalized {
let absolute_file_id = if absolute_path == normalized {
file_id
} else {
db.file_id(&absolute_path)
};
modules.by_file.insert(absolute_id, id);
modules.by_file.insert(absolute_file_id, module);
entry.insert_entry(id);
entry.insert_entry(module);
Ok(Some(id))
Ok(Some(module))
}
}
}
/// Resolves the module id for the given path.
/// Resolves the module for the given path.
///
/// Returns `None` if the path is not a module in `sys.path`.
/// Returns `None` if the path is not a module locatable via `sys.path`.
#[tracing::instrument(level = "debug", skip(db))]
pub fn path_to_module(db: &dyn SemanticDb, path: &Path) -> QueryResult<Option<Module>> {
let file = db.file_id(path);
file_to_module(db, file)
}
/// Resolves the module id for the file with the given id.
/// Resolves the module for the file with the given id.
///
/// Returns `None` if the file is not a module in `sys.path`.
/// Returns `None` if the file is not a module locatable via `sys.path`.
#[tracing::instrument(level = "debug", skip(db))]
pub fn file_to_module(db: &dyn SemanticDb, file: FileId) -> QueryResult<Option<Module>> {
let jar: &SemanticJar = db.jar()?;
@@ -325,12 +352,12 @@ pub fn file_to_module(db: &dyn SemanticDb, file: FileId) -> QueryResult<Option<M
// Resolve the module name to see if Python would resolve the name to the same path.
// If it doesn't, then that means that multiple modules have the same name in different
// root paths, but that the module corresponding to the passed path is in a lower priority path,
// root paths, but that the module corresponding to the passed path is in a lower priority search path,
// in which case we ignore it.
let Some(module_id) = resolve_module(db, module_name)? else {
let Some(module) = resolve_module(db, module_name)? else {
return Ok(None);
};
let module_path = module_id.path(db)?;
let module_path = module.path(db)?;
if module_path.root() == &root_path {
let Ok(normalized) = path.canonicalize() else {
@@ -350,7 +377,7 @@ pub fn file_to_module(db: &dyn SemanticDb, file: FileId) -> QueryResult<Option<M
}
// Path has been inserted by `resolve_module`
Ok(Some(module_id))
Ok(Some(module))
} else {
// This path is for a module with the same name but in a module search path with a lower priority.
// Ignore it.
@@ -363,25 +390,100 @@ pub fn file_to_module(db: &dyn SemanticDb, file: FileId) -> QueryResult<Option<M
//////////////////////////////////////////////////////
/// Changes the module search paths to `search_paths`.
pub fn set_module_search_paths(db: &mut dyn SemanticDb, search_paths: Vec<ModuleSearchPath>) {
pub fn set_module_search_paths(db: &mut dyn SemanticDb, search_paths: ModuleResolutionInputs) {
let jar: &mut SemanticJar = db.jar_mut();
jar.module_resolver = ModuleResolver::new(search_paths);
jar.module_resolver = ModuleResolver::new(search_paths.into_ordered_search_paths());
}
/// Adds a module to the resolver.
/// Struct for holding the various paths that are put together
/// to create an `OrderedSearchPaths` instance
///
/// - `extra_paths` is a list of user-provided paths
/// that should take first priority in the module resolution.
/// Examples in other type checkers are mypy's MYPYPATH environment variable,
/// or pyright's stubPath configuration setting.
/// - `workspace_root` is the root of the workspace,
/// used for finding first-party modules
/// - `site_packages` is the path to the user's `site-packages` directory,
/// where third-party packages from `PyPI` are installed
/// - `custom_typeshed` is a path to standard-library typeshed stubs.
/// Currently this has to be a directory that exists on disk.
/// (TODO: fall back to vendored stubs if no custom directory is provided.)
#[derive(Debug)]
pub struct ModuleResolutionInputs {
pub extra_paths: Vec<PathBuf>,
pub workspace_root: PathBuf,
pub site_packages: Option<PathBuf>,
pub custom_typeshed: Option<PathBuf>,
}
impl ModuleResolutionInputs {
/// Implementation of PEP 561's module resolution order
/// (with some small, deliberate differences)
fn into_ordered_search_paths(self) -> OrderedSearchPaths {
let ModuleResolutionInputs {
extra_paths,
workspace_root,
site_packages,
custom_typeshed,
} = self;
OrderedSearchPaths(
extra_paths
.into_iter()
.map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::Extra))
.chain(std::iter::once(ModuleSearchPath::new(
workspace_root,
ModuleSearchPathKind::FirstParty,
)))
// TODO fallback to vendored typeshed stubs if no custom typeshed directory is provided by the user
.chain(custom_typeshed.into_iter().map(|path| {
ModuleSearchPath::new(
path.join(TYPESHED_STDLIB_DIRECTORY),
ModuleSearchPathKind::StandardLibrary,
)
}))
.chain(site_packages.into_iter().map(|path| {
ModuleSearchPath::new(path, ModuleSearchPathKind::SitePackagesThirdParty)
}))
// TODO vendor typeshed's third-party stubs as well as the stdlib and fallback to them as a final step
.collect(),
)
}
}
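Putting the pieces together, a sketch of wiring up the resolver from a hypothetical project layout (paths are illustrative; `db` is assumed to implement `SemanticDb`):

```rust
use std::path::PathBuf;

let inputs = ModuleResolutionInputs {
    extra_paths: vec![],
    workspace_root: PathBuf::from("/repo/src"),
    site_packages: Some(PathBuf::from("/repo/.venv/lib/python3.12/site-packages")),
    custom_typeshed: Some(PathBuf::from("/repo/typeshed")),
};
// Resulting search order: /repo/src, /repo/typeshed/stdlib, then site-packages.
set_module_search_paths(&mut db, inputs);
```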
const TYPESHED_STDLIB_DIRECTORY: &str = "stdlib";
/// The resolved module search-path order, implementing PEP 561
/// (with some small, deliberate differences)
#[derive(Clone, Debug, Default, Eq, PartialEq)]
struct OrderedSearchPaths(Vec<ModuleSearchPath>);
impl Deref for OrderedSearchPaths {
type Target = [ModuleSearchPath];
fn deref(&self) -> &Self::Target {
&self.0
}
}
/// Adds a module located at `path` to the resolver.
///
/// Returns `None` if the path doesn't resolve to a module.
///
/// Returns `Some` with the id of the module and the ids of the modules that need re-resolving
/// because they were part of a namespace package and might now resolve differently.
/// Returns `Some(module, other_modules)`, where `module` is the resolved module
/// with file location `path`, and `other_modules` is a `Vec` of `ModuleData` instances.
/// Each element in `other_modules` provides information regarding a single module that needs
/// re-resolving because it was part of a namespace package and might now resolve differently.
///
/// Note: This won't work with salsa because `Path` is not an ingredient.
pub fn add_module(db: &mut dyn SemanticDb, path: &Path) -> Option<(Module, Vec<Arc<ModuleData>>)> {
// No locking is required because we're holding a mutable reference to `modules`.
// TODO This needs tests
// Note: Intentionally by-pass caching here. Module should not be in the cache yet.
// Note: Intentionally bypass caching here. Module should not be in the cache yet.
let module = path_to_module(db, path).ok()??;
// The code below is to handle the addition of `__init__.py` files.
@@ -405,15 +507,15 @@ pub fn add_module(db: &mut dyn SemanticDb, path: &Path) -> Option<(Module, Vec<A
let jar: &mut SemanticJar = db.jar_mut();
let modules = &mut jar.module_resolver;
modules.by_file.retain(|_, id| {
modules.by_file.retain(|_, module| {
if modules
.modules
.get(id)
.get(module)
.unwrap()
.name
.starts_with(&parent_name)
{
to_remove.push(*id);
to_remove.push(*module);
false
} else {
true
@@ -422,8 +524,8 @@ pub fn add_module(db: &mut dyn SemanticDb, path: &Path) -> Option<(Module, Vec<A
// TODO remove need for this vec
let mut removed = Vec::with_capacity(to_remove.len());
for id in &to_remove {
removed.push(modules.remove_module_by_id(*id));
for module in &to_remove {
removed.push(modules.remove_module(*module));
}
Some((module, removed))
@@ -432,14 +534,14 @@ pub fn add_module(db: &mut dyn SemanticDb, path: &Path) -> Option<(Module, Vec<A
#[derive(Default)]
pub struct ModuleResolver {
/// The search paths where modules are located (and searched). Corresponds to `sys.path` at runtime.
search_paths: Vec<ModuleSearchPath>,
search_paths: OrderedSearchPaths,
// Locking: Locking is done by acquiring a (write) lock on `by_name`. This is because `by_name` is the primary
// lookup method. Acquiring locks in any other ordering can result in deadlocks.
/// Resolves a module name to it's module id.
/// Looks up a module by name
by_name: FxDashMap<ModuleName, Module>,
/// All known modules, indexed by the module id.
/// A map of all known modules to data about those modules
modules: FxDashMap<Module, Arc<ModuleData>>,
/// Lookup from absolute path to module.
@@ -449,7 +551,7 @@ pub struct ModuleResolver {
}
impl ModuleResolver {
pub fn new(search_paths: Vec<ModuleSearchPath>) -> Self {
fn new(search_paths: OrderedSearchPaths) -> Self {
Self {
search_paths,
modules: FxDashMap::default(),
@@ -459,24 +561,27 @@ impl ModuleResolver {
}
}
pub(crate) fn remove_module(&mut self, file_id: FileId) {
/// Remove a module from the inner cache
pub(crate) fn remove_module_by_file(&mut self, file_id: FileId) {
// No locking is required because we're holding a mutable reference to `self`.
let Some((_, id)) = self.by_file.remove(&file_id) else {
let Some((_, module)) = self.by_file.remove(&file_id) else {
return;
};
self.remove_module_by_id(id);
self.remove_module(module);
}
fn remove_module_by_id(&mut self, id: Module) -> Arc<ModuleData> {
let (_, module) = self.modules.remove(&id).unwrap();
fn remove_module(&mut self, module: Module) -> Arc<ModuleData> {
let (_, module_data) = self.modules.remove(&module).unwrap();
self.by_name.remove(&module.name).unwrap();
self.by_name.remove(&module_data.name).unwrap();
// It's possible that multiple paths map to the same id. Search all other paths referencing the same module id.
self.by_file.retain(|_, current_id| *current_id != id);
// It's possible that multiple paths map to the same module.
// Search all other paths referencing the same module.
self.by_file
.retain(|_, current_module| *current_module != module);
module
module_data
}
}
@@ -505,15 +610,19 @@ impl ModulePath {
Self { root, file_id }
}
/// The search path that was used to locate the module
pub fn root(&self) -> &ModuleSearchPath {
&self.root
}
/// The file containing the source code for the module
pub fn file(&self) -> FileId {
self.file_id
}
}
/// Given a module name and a list of search paths in which to look up modules,
/// attempt to resolve the module name
fn resolve_name(
name: &ModuleName,
search_paths: &[ModuleSearchPath],
@@ -635,7 +744,9 @@ enum PackageKind {
/// A root package or module. E.g. `foo` in `foo.bar.baz` or just `foo`.
Root,
/// A regular sub-package where the parent contains an `__init__.py`. For example `bar` in `foo.bar` when the `foo` directory contains an `__init__.py`.
/// A regular sub-package where the parent contains an `__init__.py`.
///
/// For example, `bar` in `foo.bar` when the `foo` directory contains an `__init__.py`.
Regular,
/// A sub-package in a namespace package. A namespace package is a package without an `__init__.py`.
@@ -653,21 +764,23 @@ impl PackageKind {
#[cfg(test)]
mod tests {
use std::num::NonZeroU32;
use std::path::PathBuf;
use crate::db::tests::TestDb;
use crate::db::SourceDb;
use crate::module::{
path_to_module, resolve_module, set_module_search_paths, ModuleKind, ModuleName,
ModuleSearchPath, ModuleSearchPathKind,
ModuleResolutionInputs, TYPESHED_STDLIB_DIRECTORY,
};
use crate::symbols::Dependency;
use crate::semantic::Dependency;
struct TestCase {
temp_dir: tempfile::TempDir,
db: TestDb,
src: ModuleSearchPath,
site_packages: ModuleSearchPath,
src: PathBuf,
custom_typeshed: PathBuf,
site_packages: PathBuf,
}
fn create_resolver() -> std::io::Result<TestCase> {
@@ -675,25 +788,31 @@ mod tests {
let src = temp_dir.path().join("src");
let site_packages = temp_dir.path().join("site_packages");
let custom_typeshed = temp_dir.path().join("typeshed");
std::fs::create_dir(&src)?;
std::fs::create_dir(&site_packages)?;
std::fs::create_dir(&custom_typeshed)?;
let src = ModuleSearchPath::new(src.canonicalize()?, ModuleSearchPathKind::FirstParty);
let site_packages = ModuleSearchPath::new(
site_packages.canonicalize()?,
ModuleSearchPathKind::ThirdParty,
);
let src = src.canonicalize()?;
let site_packages = site_packages.canonicalize()?;
let custom_typeshed = custom_typeshed.canonicalize()?;
let roots = vec![src.clone(), site_packages.clone()];
let search_paths = ModuleResolutionInputs {
extra_paths: vec![],
workspace_root: src.clone(),
site_packages: Some(site_packages.clone()),
custom_typeshed: Some(custom_typeshed.clone()),
};
let mut db = TestDb::default();
set_module_search_paths(&mut db, roots);
set_module_search_paths(&mut db, search_paths);
Ok(TestCase {
temp_dir,
db,
src,
custom_typeshed,
site_packages,
})
}
@@ -707,7 +826,7 @@ mod tests {
..
} = create_resolver()?;
let foo_path = src.path().join("foo.py");
let foo_path = src.join("foo.py");
std::fs::write(&foo_path, "print('Hello, world!')")?;
let foo_module = resolve_module(&db, ModuleName::new("foo"))?.unwrap();
@@ -718,7 +837,7 @@ mod tests {
);
assert_eq!(ModuleName::new("foo"), foo_module.name(&db)?);
assert_eq!(&src, foo_module.path(&db)?.root());
assert_eq!(&src, foo_module.path(&db)?.root().path());
assert_eq!(ModuleKind::Module, foo_module.kind(&db)?);
assert_eq!(&foo_path, &*db.file_path(foo_module.path(&db)?.file()));
@@ -727,6 +846,76 @@ mod tests {
Ok(())
}
#[test]
fn stdlib() -> anyhow::Result<()> {
let TestCase {
db,
custom_typeshed,
..
} = create_resolver()?;
let stdlib_dir = custom_typeshed.join(TYPESHED_STDLIB_DIRECTORY);
std::fs::create_dir_all(&stdlib_dir).unwrap();
let functools_path = stdlib_dir.join("functools.py");
std::fs::write(&functools_path, "def update_wrapper(): ...").unwrap();
let functools_module = resolve_module(&db, ModuleName::new("functools"))?.unwrap();
assert_eq!(
Some(functools_module),
resolve_module(&db, ModuleName::new("functools"))?
);
assert_eq!(&stdlib_dir, functools_module.path(&db)?.root().path());
assert_eq!(ModuleKind::Module, functools_module.kind(&db)?);
assert_eq!(
&functools_path,
&*db.file_path(functools_module.path(&db)?.file())
);
assert_eq!(
Some(functools_module),
path_to_module(&db, &functools_path)?
);
Ok(())
}
#[test]
fn first_party_precedence_over_stdlib() -> anyhow::Result<()> {
let TestCase {
db,
src,
custom_typeshed,
..
} = create_resolver()?;
let stdlib_dir = custom_typeshed.join(TYPESHED_STDLIB_DIRECTORY);
std::fs::create_dir_all(&stdlib_dir).unwrap();
std::fs::create_dir_all(&src).unwrap();
let stdlib_functools_path = stdlib_dir.join("functools.py");
let first_party_functools_path = src.join("functools.py");
std::fs::write(stdlib_functools_path, "def update_wrapper(): ...").unwrap();
std::fs::write(&first_party_functools_path, "def update_wrapper(): ...").unwrap();
let functools_module = resolve_module(&db, ModuleName::new("functools"))?.unwrap();
assert_eq!(
Some(functools_module),
resolve_module(&db, ModuleName::new("functools"))?
);
assert_eq!(&src, functools_module.path(&db).unwrap().root().path());
assert_eq!(ModuleKind::Module, functools_module.kind(&db)?);
assert_eq!(
&first_party_functools_path,
&*db.file_path(functools_module.path(&db)?.file())
);
assert_eq!(
Some(functools_module),
path_to_module(&db, &first_party_functools_path)?
);
Ok(())
}
#[test]
fn resolve_package() -> anyhow::Result<()> {
let TestCase {
@@ -736,7 +925,7 @@ mod tests {
..
} = create_resolver()?;
let foo_dir = src.path().join("foo");
let foo_dir = src.join("foo");
let foo_path = foo_dir.join("__init__.py");
std::fs::create_dir(&foo_dir)?;
std::fs::write(&foo_path, "print('Hello, world!')")?;
@@ -744,7 +933,7 @@ mod tests {
let foo_module = resolve_module(&db, ModuleName::new("foo"))?.unwrap();
assert_eq!(ModuleName::new("foo"), foo_module.name(&db)?);
assert_eq!(&src, foo_module.path(&db)?.root());
assert_eq!(&src, foo_module.path(&db)?.root().path());
assert_eq!(&foo_path, &*db.file_path(foo_module.path(&db)?.file()));
assert_eq!(Some(foo_module), path_to_module(&db, &foo_path)?);
@@ -764,17 +953,17 @@ mod tests {
..
} = create_resolver()?;
let foo_dir = src.path().join("foo");
let foo_dir = src.join("foo");
let foo_init = foo_dir.join("__init__.py");
std::fs::create_dir(&foo_dir)?;
std::fs::write(&foo_init, "print('Hello, world!')")?;
let foo_py = src.path().join("foo.py");
let foo_py = src.join("foo.py");
std::fs::write(&foo_py, "print('Hello, world!')")?;
let foo_module = resolve_module(&db, ModuleName::new("foo"))?.unwrap();
assert_eq!(&src, foo_module.path(&db)?.root());
assert_eq!(&src, foo_module.path(&db)?.root().path());
assert_eq!(&foo_init, &*db.file_path(foo_module.path(&db)?.file()));
assert_eq!(ModuleKind::Package, foo_module.kind(&db)?);
@@ -793,14 +982,14 @@ mod tests {
..
} = create_resolver()?;
let foo_stub = src.path().join("foo.pyi");
let foo_py = src.path().join("foo.py");
let foo_stub = src.join("foo.pyi");
let foo_py = src.join("foo.py");
std::fs::write(&foo_stub, "x: int")?;
std::fs::write(&foo_py, "print('Hello, world!')")?;
let foo = resolve_module(&db, ModuleName::new("foo"))?.unwrap();
assert_eq!(&src, foo.path(&db)?.root());
assert_eq!(&src, foo.path(&db)?.root().path());
assert_eq!(&foo_stub, &*db.file_path(foo.path(&db)?.file()));
assert_eq!(Some(foo), path_to_module(&db, &foo_stub)?);
@@ -818,7 +1007,7 @@ mod tests {
..
} = create_resolver()?;
let foo = src.path().join("foo");
let foo = src.join("foo");
let bar = foo.join("bar");
let baz = bar.join("baz.py");
@@ -829,7 +1018,7 @@ mod tests {
let baz_module = resolve_module(&db, ModuleName::new("foo.bar.baz"))?.unwrap();
assert_eq!(&src, baz_module.path(&db)?.root());
assert_eq!(&src, baz_module.path(&db)?.root().path());
assert_eq!(&baz, &*db.file_path(baz_module.path(&db)?.file()));
assert_eq!(Some(baz_module), path_to_module(&db, &baz)?);
@@ -844,6 +1033,7 @@ mod tests {
temp_dir: _,
src,
site_packages,
..
} = create_resolver()?;
// From [PEP420](https://peps.python.org/pep-0420/#nested-namespace-packages).
@@ -859,14 +1049,14 @@ mod tests {
// two.py
// ```
let parent1 = src.path().join("parent");
let parent1 = src.join("parent");
let child1 = parent1.join("child");
let one = child1.join("one.py");
std::fs::create_dir_all(child1)?;
std::fs::write(&one, "print('Hello, world!')")?;
let parent2 = site_packages.path().join("parent");
let parent2 = site_packages.join("parent");
let child2 = parent2.join("child");
let two = child2.join("two.py");
@@ -890,6 +1080,7 @@ mod tests {
temp_dir: _,
src,
site_packages,
..
} = create_resolver()?;
// Adopted test case from the [PEP420 examples](https://peps.python.org/pep-0420/#nested-namespace-packages).
@@ -905,7 +1096,7 @@ mod tests {
// two.py
// ```
let parent1 = src.path().join("parent");
let parent1 = src.join("parent");
let child1 = parent1.join("child");
let one = child1.join("one.py");
@@ -913,7 +1104,7 @@ mod tests {
std::fs::write(child1.join("__init__.py"), "print('Hello, world!')")?;
std::fs::write(&one, "print('Hello, world!')")?;
let parent2 = site_packages.path().join("parent");
let parent2 = site_packages.join("parent");
let child2 = parent2.join("child");
let two = child2.join("two.py");
@@ -938,17 +1129,18 @@ mod tests {
src,
site_packages,
temp_dir: _temp_dir,
..
} = create_resolver()?;
let foo_src = src.path().join("foo.py");
let foo_site_packages = site_packages.path().join("foo.py");
let foo_src = src.join("foo.py");
let foo_site_packages = site_packages.join("foo.py");
std::fs::write(&foo_src, "")?;
std::fs::write(&foo_site_packages, "")?;
let foo_module = resolve_module(&db, ModuleName::new("foo"))?.unwrap();
assert_eq!(&src, foo_module.path(&db)?.root());
assert_eq!(&src, foo_module.path(&db)?.root().path());
assert_eq!(&foo_src, &*db.file_path(foo_module.path(&db)?.file()));
assert_eq!(Some(foo_module), path_to_module(&db, &foo_src)?);
@@ -967,8 +1159,8 @@ mod tests {
..
} = create_resolver()?;
let foo = src.path().join("foo.py");
let bar = src.path().join("bar.py");
let foo = src.join("foo.py");
let bar = src.join("bar.py");
std::fs::write(&foo, "")?;
std::os::unix::fs::symlink(&foo, &bar)?;
@@ -978,12 +1170,12 @@ mod tests {
assert_ne!(foo_module, bar_module);
assert_eq!(&src, foo_module.path(&db)?.root());
assert_eq!(&src, foo_module.path(&db)?.root().path());
assert_eq!(&foo, &*db.file_path(foo_module.path(&db)?.file()));
// Bar has a different name but it should point to the same file.
assert_eq!(&src, bar_module.path(&db)?.root());
assert_eq!(&src, bar_module.path(&db)?.root().path());
assert_eq!(foo_module.path(&db)?.file(), bar_module.path(&db)?.file());
assert_eq!(&foo, &*db.file_path(bar_module.path(&db)?.file()));
@@ -1002,7 +1194,7 @@ mod tests {
..
} = create_resolver()?;
let foo_dir = src.path().join("foo");
let foo_dir = src.join("foo");
let foo_path = foo_dir.join("__init__.py");
let bar_path = foo_dir.join("bar.py");

View File

@@ -1,85 +1,33 @@
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
use ruff_python_ast as ast;
use ruff_python_parser::{Mode, ParseError};
use ruff_text_size::{Ranged, TextRange};
use ruff_python_ast::ModModule;
use ruff_python_parser::Parsed;
use crate::cache::KeyValueCache;
use crate::db::{QueryResult, SourceDb};
use crate::files::FileId;
use crate::source::source_text;
#[derive(Debug, Clone, PartialEq)]
pub struct Parsed {
inner: Arc<ParsedInner>,
}
#[derive(Debug, PartialEq)]
struct ParsedInner {
ast: ast::ModModule,
errors: Vec<ParseError>,
}
impl Parsed {
fn new(ast: ast::ModModule, errors: Vec<ParseError>) -> Self {
Self {
inner: Arc::new(ParsedInner { ast, errors }),
}
}
pub(crate) fn from_text(text: &str) -> Self {
let result = ruff_python_parser::parse(text, Mode::Module);
let (module, errors) = match result {
Ok(ast::Mod::Module(module)) => (module, vec![]),
Ok(ast::Mod::Expression(expression)) => (
ast::ModModule {
range: expression.range(),
body: vec![ast::Stmt::Expr(ast::StmtExpr {
range: expression.range(),
value: expression.body,
})],
},
vec![],
),
Err(errors) => (
ast::ModModule {
range: TextRange::default(),
body: Vec::new(),
},
vec![errors],
),
};
Parsed::new(module, errors)
}
pub fn ast(&self) -> &ast::ModModule {
&self.inner.ast
}
pub fn errors(&self) -> &[ParseError] {
&self.inner.errors
}
}
#[tracing::instrument(level = "debug", skip(db))]
pub(crate) fn parse(db: &dyn SourceDb, file_id: FileId) -> QueryResult<Parsed> {
pub(crate) fn parse(db: &dyn SourceDb, file_id: FileId) -> QueryResult<Arc<Parsed<ModModule>>> {
let jar = db.jar()?;
jar.parsed.get(&file_id, |file_id| {
let source = source_text(db, *file_id)?;
Ok(Parsed::from_text(source.text()))
Ok(Arc::new(ruff_python_parser::parse_unchecked_source(
source.text(),
source.kind().into(),
)))
})
}
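A usage sketch for the query above (`db` and `file_id` are assumed to be in scope; the `Arc` makes the cached AST cheap to share):

```rust
let parsed = parse(db, file_id)?;
// `syntax()` exposes the `ModModule`; parse errors are carried alongside
// the AST rather than failing the query.
for _stmt in &parsed.syntax().body {
    // walk top-level statements
}
```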
#[derive(Debug, Default)]
pub struct ParsedStorage(KeyValueCache<FileId, Parsed>);
pub struct ParsedStorage(KeyValueCache<FileId, Arc<Parsed<ModModule>>>);
impl Deref for ParsedStorage {
type Target = KeyValueCache<FileId, Parsed>;
type Target = KeyValueCache<FileId, Arc<Parsed<ModModule>>>;
fn deref(&self) -> &Self::Target {
&self.0

View File

@@ -6,7 +6,7 @@ use crate::files::FileId;
use crate::lint::{lint_semantic, lint_syntax, Diagnostics};
use crate::module::{file_to_module, resolve_module};
use crate::program::Program;
use crate::symbols::{symbol_table, Dependency};
use crate::semantic::{semantic_index, Dependency};
impl Program {
/// Checks all open files in the workspace and its dependencies.
@@ -28,8 +28,8 @@ impl Program {
fn check_file(&self, file: FileId, context: &CheckFileContext) -> QueryResult<Diagnostics> {
self.cancelled()?;
let symbol_table = symbol_table(self, file)?;
let dependencies = symbol_table.dependencies();
let index = semantic_index(self, file)?;
let dependencies = index.symbol_table().dependencies();
if !dependencies.is_empty() {
let module = file_to_module(self, file)?;

View File

@@ -42,8 +42,8 @@ impl Program {
let (source, semantic, lint) = self.jars_mut();
for change in aggregated_changes.iter() {
semantic.module_resolver.remove_module(change.id);
semantic.symbol_tables.remove(&change.id);
semantic.module_resolver.remove_module_by_file(change.id);
semantic.semantic_indices.remove(&change.id);
source.sources.remove(&change.id);
source.parsed.remove(&change.id);
// TODO: remove all dependent modules as well

View File

@@ -0,0 +1,882 @@
use std::num::NonZeroU32;
use ruff_python_ast as ast;
use ruff_python_ast::visitor::source_order::SourceOrderVisitor;
use ruff_python_ast::AstNode;
use crate::ast_ids::{NodeKey, TypedNodeKey};
use crate::cache::KeyValueCache;
use crate::db::{QueryResult, SemanticDb, SemanticJar};
use crate::files::FileId;
use crate::module::Module;
use crate::module::ModuleName;
use crate::parse::parse;
use crate::Name;
pub(crate) use definitions::Definition;
use definitions::{ImportDefinition, ImportFromDefinition};
pub(crate) use flow_graph::ConstrainedDefinition;
use flow_graph::{FlowGraph, FlowGraphBuilder, FlowNodeId, ReachableDefinitionsIterator};
use ruff_index::{newtype_index, IndexVec};
use rustc_hash::FxHashMap;
use std::ops::{Deref, DerefMut};
use std::sync::Arc;
pub(crate) use symbol_table::{Dependency, SymbolId};
use symbol_table::{ScopeId, ScopeKind, SymbolFlags, SymbolTable, SymbolTableBuilder};
pub(crate) use types::{infer_definition_type, infer_symbol_public_type, Type, TypeStore};
mod definitions;
mod flow_graph;
mod symbol_table;
mod types;
#[tracing::instrument(level = "debug", skip(db))]
pub fn semantic_index(db: &dyn SemanticDb, file_id: FileId) -> QueryResult<Arc<SemanticIndex>> {
let jar: &SemanticJar = db.jar()?;
jar.semantic_indices.get(&file_id, |_| {
let parsed = parse(db.upcast(), file_id)?;
Ok(Arc::from(SemanticIndex::from_ast(parsed.syntax())))
})
}
#[tracing::instrument(level = "debug", skip(db))]
pub fn resolve_global_symbol(
db: &dyn SemanticDb,
module: Module,
name: &str,
) -> QueryResult<Option<GlobalSymbolId>> {
let file_id = module.path(db)?.file();
let symbol_table = &semantic_index(db, file_id)?.symbol_table;
let Some(symbol_id) = symbol_table.root_symbol_id_by_name(name) else {
return Ok(None);
};
Ok(Some(GlobalSymbolId { file_id, symbol_id }))
}
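A sketch of consuming this query (`db` and `module` assumed to be in scope):

```rust
if let Some(global) = resolve_global_symbol(db, module, "update_wrapper")? {
    // `file_id`/`symbol_id` key into that file's semantic index
    // and its symbol table.
    let index = semantic_index(db, global.file_id)?;
    let _table = index.symbol_table();
}
```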
#[newtype_index]
pub struct ExpressionId;
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct GlobalSymbolId {
pub(crate) file_id: FileId,
pub(crate) symbol_id: SymbolId,
}
#[derive(Debug)]
pub struct SemanticIndex {
symbol_table: SymbolTable,
flow_graph: FlowGraph,
expressions: FxHashMap<NodeKey, ExpressionId>,
expressions_by_id: IndexVec<ExpressionId, NodeKey>,
}
impl SemanticIndex {
pub fn from_ast(module: &ast::ModModule) -> Self {
let root_scope_id = SymbolTable::root_scope_id();
let mut indexer = SemanticIndexer {
symbol_table_builder: SymbolTableBuilder::new(),
flow_graph_builder: FlowGraphBuilder::new(),
scopes: vec![ScopeState {
scope_id: root_scope_id,
current_flow_node_id: FlowGraph::start(),
}],
expressions: FxHashMap::default(),
expressions_by_id: IndexVec::default(),
current_definition: None,
};
indexer.visit_body(&module.body);
indexer.finish()
}
fn resolve_expression_id<'a>(
&self,
ast: &'a ast::ModModule,
expression_id: ExpressionId,
) -> ast::AnyNodeRef<'a> {
let node_key = self.expressions_by_id[expression_id];
node_key
.resolve(ast.as_any_node_ref())
.expect("node to resolve")
}
/// Return an iterator over all definitions of `symbol_id` reachable from `use_expr`. The value
/// of `symbol_id` in `use_expr` must originate from one of the iterated definitions (or from
/// an external reassignment of the name outside of this scope).
pub fn reachable_definitions(
&self,
symbol_id: SymbolId,
use_expr: &ast::Expr,
) -> ReachableDefinitionsIterator {
let expression_id = self.expression_id(use_expr);
ReachableDefinitionsIterator::new(
&self.flow_graph,
symbol_id,
self.flow_graph.for_expr(expression_id),
)
}
pub fn expression_id(&self, expression: &ast::Expr) -> ExpressionId {
self.expressions[&NodeKey::from_node(expression.into())]
}
pub fn symbol_table(&self) -> &SymbolTable {
&self.symbol_table
}
}
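To make the API above concrete, a sketch of querying the reachable definitions for a use of `x` (compare the `reachability_trivial` test below; `use_expr: &ast::Expr` is assumed to come from the same module):

```rust
let index = semantic_index(db, file_id)?;
if let Some(x_sym) = index.symbol_table().root_symbol_id_by_name("x") {
    for constrained in index.reachable_definitions(x_sym, use_expr) {
        // Each item pairs a `Definition` with the branch-test expressions
        // (constraints) crossed between the use and that definition.
        let _ = (&constrained.definition, &constrained.constraints);
    }
}
```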
#[derive(Debug)]
struct ScopeState {
scope_id: ScopeId,
current_flow_node_id: FlowNodeId,
}
#[derive(Debug)]
struct SemanticIndexer {
symbol_table_builder: SymbolTableBuilder,
flow_graph_builder: FlowGraphBuilder,
scopes: Vec<ScopeState>,
/// the definition whose target(s) we are currently walking
current_definition: Option<Definition>,
expressions: FxHashMap<NodeKey, ExpressionId>,
expressions_by_id: IndexVec<ExpressionId, NodeKey>,
}
impl SemanticIndexer {
pub(crate) fn finish(mut self) -> SemanticIndex {
// Shrink before destructuring: the destructuring below moves `self`,
// so `self.expressions` can no longer be touched afterwards.
self.expressions.shrink_to_fit();
self.expressions_by_id.shrink_to_fit();
let SemanticIndexer {
flow_graph_builder,
symbol_table_builder,
expressions,
expressions_by_id,
..
} = self;
SemanticIndex {
flow_graph: flow_graph_builder.finish(),
symbol_table: symbol_table_builder.finish(),
expressions,
expressions_by_id,
}
}
fn set_current_flow_node(&mut self, new_flow_node_id: FlowNodeId) {
let scope_state = self.scopes.last_mut().expect("scope stack is never empty");
scope_state.current_flow_node_id = new_flow_node_id;
}
fn current_flow_node(&self) -> FlowNodeId {
self.scopes
.last()
.expect("scope stack is never empty")
.current_flow_node_id
}
fn add_or_update_symbol(&mut self, identifier: &str, flags: SymbolFlags) -> SymbolId {
self.symbol_table_builder
.add_or_update_symbol(self.cur_scope(), identifier, flags)
}
fn add_or_update_symbol_with_def(
&mut self,
identifier: &str,
definition: Definition,
) -> SymbolId {
let symbol_id = self.add_or_update_symbol(identifier, SymbolFlags::IS_DEFINED);
self.symbol_table_builder
.add_definition(symbol_id, definition.clone());
let new_flow_node_id =
self.flow_graph_builder
.add_definition(symbol_id, definition, self.current_flow_node());
self.set_current_flow_node(new_flow_node_id);
symbol_id
}
fn push_scope(
&mut self,
name: &str,
kind: ScopeKind,
definition: Option<Definition>,
defining_symbol: Option<SymbolId>,
) -> ScopeId {
let scope_id = self.symbol_table_builder.add_child_scope(
self.cur_scope(),
name,
kind,
definition,
defining_symbol,
);
self.scopes.push(ScopeState {
scope_id,
current_flow_node_id: FlowGraph::start(),
});
scope_id
}
fn pop_scope(&mut self) -> ScopeId {
self.scopes
.pop()
.expect("Scope stack should never be empty")
.scope_id
}
fn cur_scope(&self) -> ScopeId {
self.scopes
.last()
.expect("Scope stack should never be empty")
.scope_id
}
fn record_scope_for_node(&mut self, node_key: NodeKey, scope_id: ScopeId) {
self.symbol_table_builder
.record_scope_for_node(node_key, scope_id);
}
fn insert_constraint(&mut self, expr: &ast::Expr) {
let node_key = NodeKey::from_node(expr.into());
let expression_id = self.expressions[&node_key];
let constraint = self
.flow_graph_builder
.add_constraint(self.current_flow_node(), expression_id);
self.set_current_flow_node(constraint);
}
fn with_type_params(
&mut self,
name: &str,
params: &Option<Box<ast::TypeParams>>,
definition: Option<Definition>,
defining_symbol: Option<SymbolId>,
nested: impl FnOnce(&mut Self) -> ScopeId,
) -> ScopeId {
if let Some(type_params) = params {
self.push_scope(name, ScopeKind::Annotation, definition, defining_symbol);
for type_param in &type_params.type_params {
let name = match type_param {
ast::TypeParam::TypeVar(ast::TypeParamTypeVar { name, .. }) => name,
ast::TypeParam::ParamSpec(ast::TypeParamParamSpec { name, .. }) => name,
ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple { name, .. }) => name,
};
self.add_or_update_symbol(name, SymbolFlags::IS_DEFINED);
}
}
let scope_id = nested(self);
if params.is_some() {
self.pop_scope();
}
scope_id
}
}
impl SourceOrderVisitor<'_> for SemanticIndexer {
fn visit_expr(&mut self, expr: &ast::Expr) {
let node_key = NodeKey::from_node(expr.into());
let expression_id = self.expressions_by_id.push(node_key);
let flow_expression_id = self
.flow_graph_builder
.record_expr(self.current_flow_node());
debug_assert_eq!(expression_id, flow_expression_id);
let symbol_expression_id = self
.symbol_table_builder
.record_expression(self.cur_scope());
debug_assert_eq!(expression_id, symbol_expression_id);
self.expressions.insert(node_key, expression_id);
match expr {
ast::Expr::Name(ast::ExprName { id, ctx, .. }) => {
let flags = match ctx {
ast::ExprContext::Load => SymbolFlags::IS_USED,
ast::ExprContext::Store => SymbolFlags::IS_DEFINED,
ast::ExprContext::Del => SymbolFlags::IS_DEFINED,
ast::ExprContext::Invalid => SymbolFlags::empty(),
};
self.add_or_update_symbol(id, flags);
if flags.contains(SymbolFlags::IS_DEFINED) {
if let Some(curdef) = self.current_definition.clone() {
self.add_or_update_symbol_with_def(id, curdef);
}
}
ast::visitor::source_order::walk_expr(self, expr);
}
ast::Expr::Named(node) => {
debug_assert!(self.current_definition.is_none());
self.current_definition =
Some(Definition::NamedExpr(TypedNodeKey::from_node(node)));
// TODO walrus in comprehensions is implicitly nonlocal
self.visit_expr(&node.target);
self.current_definition = None;
self.visit_expr(&node.value);
}
ast::Expr::If(ast::ExprIf {
body, test, orelse, ..
}) => {
// TODO detect statically known truthy or falsy test (via type inference, not naive
// AST inspection, so we can't simplify here, need to record test expression in CFG
// for later checking)
self.visit_expr(test);
let if_branch = self.flow_graph_builder.add_branch(self.current_flow_node());
self.set_current_flow_node(if_branch);
self.insert_constraint(test);
self.visit_expr(body);
let post_body = self.current_flow_node();
self.set_current_flow_node(if_branch);
self.visit_expr(orelse);
let post_else = self
.flow_graph_builder
.add_phi(self.current_flow_node(), post_body);
self.set_current_flow_node(post_else);
}
_ => {
ast::visitor::source_order::walk_expr(self, expr);
}
}
}
fn visit_stmt(&mut self, stmt: &ast::Stmt) {
// TODO need to capture more definition statements here
match stmt {
ast::Stmt::ClassDef(node) => {
let node_key = TypedNodeKey::from_node(node);
let def = Definition::ClassDef(node_key.clone());
let symbol_id = self.add_or_update_symbol_with_def(&node.name, def.clone());
for decorator in &node.decorator_list {
self.visit_decorator(decorator);
}
let scope_id = self.with_type_params(
&node.name,
&node.type_params,
Some(def.clone()),
Some(symbol_id),
|indexer| {
if let Some(arguments) = &node.arguments {
indexer.visit_arguments(arguments);
}
let scope_id = indexer.push_scope(
&node.name,
ScopeKind::Class,
Some(def.clone()),
Some(symbol_id),
);
indexer.visit_body(&node.body);
indexer.pop_scope();
scope_id
},
);
self.record_scope_for_node(*node_key.erased(), scope_id);
}
ast::Stmt::FunctionDef(node) => {
let node_key = TypedNodeKey::from_node(node);
let def = Definition::FunctionDef(node_key.clone());
let symbol_id = self.add_or_update_symbol_with_def(&node.name, def.clone());
for decorator in &node.decorator_list {
self.visit_decorator(decorator);
}
let scope_id = self.with_type_params(
&node.name,
&node.type_params,
Some(def.clone()),
Some(symbol_id),
|indexer| {
indexer.visit_parameters(&node.parameters);
for expr in &node.returns {
indexer.visit_annotation(expr);
}
let scope_id = indexer.push_scope(
&node.name,
ScopeKind::Function,
Some(def.clone()),
Some(symbol_id),
);
indexer.visit_body(&node.body);
indexer.pop_scope();
scope_id
},
);
self.record_scope_for_node(*node_key.erased(), scope_id);
}
ast::Stmt::Import(ast::StmtImport { names, .. }) => {
for alias in names {
let symbol_name = if let Some(asname) = &alias.asname {
asname.id.as_str()
} else {
alias.name.id.split('.').next().unwrap()
};
let module = ModuleName::new(&alias.name.id);
let def = Definition::Import(ImportDefinition {
module: module.clone(),
});
self.add_or_update_symbol_with_def(symbol_name, def);
self.symbol_table_builder
.add_dependency(Dependency::Module(module));
}
}
ast::Stmt::ImportFrom(ast::StmtImportFrom {
module,
names,
level,
..
}) => {
let module = module.as_ref().map(|m| ModuleName::new(&m.id));
for alias in names {
let symbol_name = if let Some(asname) = &alias.asname {
asname.id.as_str()
} else {
alias.name.id.as_str()
};
let def = Definition::ImportFrom(ImportFromDefinition {
module: module.clone(),
name: Name::new(&alias.name.id),
level: *level,
});
self.add_or_update_symbol_with_def(symbol_name, def);
}
let dependency = if let Some(module) = module {
match NonZeroU32::new(*level) {
Some(level) => Dependency::Relative {
level,
module: Some(module),
},
None => Dependency::Module(module),
}
} else {
Dependency::Relative {
level: NonZeroU32::new(*level)
.expect("Import without a module to have a level > 0"),
module,
}
};
self.symbol_table_builder.add_dependency(dependency);
}
ast::Stmt::Assign(node) => {
debug_assert!(self.current_definition.is_none());
self.visit_expr(&node.value);
self.current_definition =
Some(Definition::Assignment(TypedNodeKey::from_node(node)));
for expr in &node.targets {
self.visit_expr(expr);
}
self.current_definition = None;
}
ast::Stmt::If(node) => {
// TODO detect statically known truthy or falsy test (via type inference, not naive
// AST inspection, so we can't simplify here, need to record test expression in CFG
// for later checking)
// we visit the if "test" condition first regardless
self.visit_expr(&node.test);
// create branch node: does the if test pass or not?
let if_branch = self.flow_graph_builder.add_branch(self.current_flow_node());
// visit the body of the `if` clause
self.set_current_flow_node(if_branch);
self.insert_constraint(&node.test);
self.visit_body(&node.body);
// Flow node for the last if/elif condition branch; represents the "no branch
// taken yet" possibility (where "taking a branch" means that the condition in an
// if or elif evaluated to true and control flow went into that clause).
let mut prior_branch = if_branch;
// Flow node for the state after the prior if/elif/else clause; represents "we have
// taken one of the branches up to this point." Initially set to the post-if-clause
// state, later will be set to the phi node joining that possible path with the
// possibility that we took a later if/elif/else clause instead.
let mut post_prior_clause = self.current_flow_node();
// Flag to mark if the final clause is an "else" -- if so, that means the "match no
// clauses" path is not possible, we have to go through one of the clauses.
let mut last_branch_is_else = false;
for clause in &node.elif_else_clauses {
if let Some(test) = &clause.test {
self.visit_expr(test);
// This is an elif clause. Create a new branch node. Its predecessor is the
// previous branch node, because we can only take one branch in an entire
// if/elif/else chain, so if we take this branch, it can only be because we
// didn't take the previous one.
prior_branch = self.flow_graph_builder.add_branch(prior_branch);
self.set_current_flow_node(prior_branch);
self.insert_constraint(test);
} else {
// This is an else clause. No need to create a branch node; there's no
// branch here, if we haven't taken any previous branch, we definitely go
// into the "else" clause.
self.set_current_flow_node(prior_branch);
last_branch_is_else = true;
}
self.visit_elif_else_clause(clause);
// Update `post_prior_clause` to a new phi node joining the possibility that we
// took any of the previous branches with the possibility that we took the one
// just visited.
post_prior_clause = self
.flow_graph_builder
.add_phi(self.current_flow_node(), post_prior_clause);
}
if !last_branch_is_else {
// Final branch was not an "else", which means it's possible we took zero
// branches in the entire if/elif chain, so we need one more phi node to join
// the "no branches taken" possibility.
post_prior_clause = self
.flow_graph_builder
.add_phi(post_prior_clause, prior_branch);
}
// Onward, with current flow node set to our final Phi node.
self.set_current_flow_node(post_prior_clause);
}
_ => {
ast::visitor::source_order::walk_stmt(self, stmt);
}
}
}
}
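To make the branch/phi bookkeeping above concrete, here is the rough shape of the flow graph built for a small `if`/`else` (a hand-drawn sketch based on the comments above, not generated output):

```
x = 1        Definition(x #1)  -- predecessor: Start
if flag:     Branch            -- predecessor: Definition(x #1)
    x = 2    Constraint(flag), then Definition(x #2)
else:
    x = 3    Definition(x #3)  -- predecessor: the same Branch
use(x)       Phi               -- joins the post-if and post-else states,
             so both x #2 and x #3 are reachable at the use
```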
#[derive(Debug, Default)]
pub struct SemanticIndexStorage(KeyValueCache<FileId, Arc<SemanticIndex>>);
impl Deref for SemanticIndexStorage {
type Target = KeyValueCache<FileId, Arc<SemanticIndex>>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for SemanticIndexStorage {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
#[cfg(test)]
mod tests {
use crate::semantic::symbol_table::{Symbol, SymbolIterator};
use ruff_python_ast as ast;
use ruff_python_ast::ModModule;
use ruff_python_parser::{Mode, Parsed};
use super::{Definition, ScopeKind, SemanticIndex, SymbolId};
fn parse(code: &str) -> Parsed<ModModule> {
ruff_python_parser::parse_unchecked(code, Mode::Module)
.try_into_module()
.unwrap()
}
fn names<I>(it: SymbolIterator<I>) -> Vec<&str>
where
I: Iterator<Item = SymbolId>,
{
let mut symbols: Vec<_> = it.map(Symbol::name).collect();
symbols.sort_unstable();
symbols
}
#[test]
fn empty() {
let parsed = parse("");
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()).len(), 0);
}
#[test]
fn simple() {
let parsed = parse("x");
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("x").unwrap())
.len(),
0
);
}
#[test]
fn annotation_only() {
let parsed = parse("x: int");
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["int", "x"]);
// TODO record definition
}
#[test]
fn import() {
let parsed = parse("import foo");
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["foo"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("foo").unwrap())
.len(),
1
);
}
#[test]
fn import_sub() {
let parsed = parse("import foo.bar");
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["foo"]);
}
#[test]
fn import_as() {
let parsed = parse("import foo.bar as baz");
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["baz"]);
}
#[test]
fn import_from() {
let parsed = parse("from bar import foo");
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["foo"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("foo").unwrap())
.len(),
1
);
assert!(
table.root_symbol_id_by_name("foo").is_some_and(|sid| {
let s = sid.symbol(&table);
s.is_defined() || !s.is_used()
}),
"symbols that are defined get the defined flag"
);
}
#[test]
fn assign() {
let parsed = parse("x = foo");
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["foo", "x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("x").unwrap())
.len(),
1
);
assert!(
table.root_symbol_id_by_name("foo").is_some_and(|sid| {
let s = sid.symbol(&table);
!s.is_defined() && s.is_used()
}),
"a symbol used but not defined in a scope should have only the used flag"
);
}
#[test]
fn class_scope() {
let parsed = parse(
"
class C:
x = 1
y = 2
",
);
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["C", "y"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let c_scope = scopes[0].scope(&table);
assert_eq!(c_scope.kind(), ScopeKind::Class);
assert_eq!(c_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("C").unwrap())
.len(),
1
);
}
#[test]
fn func_scope() {
let parsed = parse(
"
def func():
x = 1
y = 2
",
);
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["func", "y"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let func_scope = scopes[0].scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Function);
assert_eq!(func_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("func").unwrap())
.len(),
1
);
}
#[test]
fn dupes() {
let parsed = parse(
"
def func():
x = 1
def func():
y = 2
",
);
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["func"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 2);
let func_scope_1 = scopes[0].scope(&table);
let func_scope_2 = scopes[1].scope(&table);
assert_eq!(func_scope_1.kind(), ScopeKind::Function);
assert_eq!(func_scope_1.name(), "func");
assert_eq!(func_scope_2.kind(), ScopeKind::Function);
assert_eq!(func_scope_2.name(), "func");
assert_eq!(names(table.symbols_for_scope(scopes[0])), vec!["x"]);
assert_eq!(names(table.symbols_for_scope(scopes[1])), vec!["y"]);
assert_eq!(
table
.definitions(table.root_symbol_id_by_name("func").unwrap())
.len(),
2
);
}
#[test]
fn generic_func() {
let parsed = parse(
"
def func[T]():
x = 1
",
);
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["func"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let ann_scope_id = scopes[0];
let ann_scope = ann_scope_id.scope(&table);
assert_eq!(ann_scope.kind(), ScopeKind::Annotation);
assert_eq!(ann_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(ann_scope_id)), vec!["T"]);
let scopes = table.child_scope_ids_of(ann_scope_id);
assert_eq!(scopes.len(), 1);
let func_scope_id = scopes[0];
let func_scope = func_scope_id.scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Function);
assert_eq!(func_scope.name(), "func");
assert_eq!(names(table.symbols_for_scope(func_scope_id)), vec!["x"]);
}
#[test]
fn generic_class() {
let parsed = parse(
"
class C[T]:
x = 1
",
);
let table = SemanticIndex::from_ast(parsed.syntax()).symbol_table;
assert_eq!(names(table.root_symbols()), vec!["C"]);
let scopes = table.root_child_scope_ids();
assert_eq!(scopes.len(), 1);
let ann_scope_id = scopes[0];
let ann_scope = ann_scope_id.scope(&table);
assert_eq!(ann_scope.kind(), ScopeKind::Annotation);
assert_eq!(ann_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(ann_scope_id)), vec!["T"]);
assert!(
table
.symbol_by_name(ann_scope_id, "T")
.is_some_and(|s| s.is_defined() && !s.is_used()),
"type parameters are defined by the scope that introduces them"
);
let scopes = table.child_scope_ids_of(ann_scope_id);
assert_eq!(scopes.len(), 1);
let func_scope_id = scopes[0];
let func_scope = func_scope_id.scope(&table);
assert_eq!(func_scope.kind(), ScopeKind::Class);
assert_eq!(func_scope.name(), "C");
assert_eq!(names(table.symbols_for_scope(func_scope_id)), vec!["x"]);
}
#[test]
fn reachability_trivial() {
let parsed = parse("x = 1; x");
let ast = parsed.syntax();
let index = SemanticIndex::from_ast(ast);
let table = &index.symbol_table;
let x_sym = table
.root_symbol_id_by_name("x")
.expect("x symbol should exist");
let ast::Stmt::Expr(ast::StmtExpr { value: x_use, .. }) = &ast.body[1] else {
panic!("should be an expr")
};
let x_defs: Vec<_> = index
.reachable_definitions(x_sym, x_use)
.map(|constrained_definition| constrained_definition.definition)
.collect();
assert_eq!(x_defs.len(), 1);
let Definition::Assignment(node_key) = &x_defs[0] else {
panic!("def should be an assignment")
};
let Some(def_node) = node_key.resolve(ast.into()) else {
panic!("node key should resolve")
};
let ast::Expr::NumberLiteral(ast::ExprNumberLiteral {
value: ast::Number::Int(num),
..
}) = &*def_node.value
else {
panic!("should be a number literal")
};
assert_eq!(*num, 1);
}
#[test]
fn expression_scope() {
let parsed = parse("x = 1;\ndef test():\n y = 4");
let ast = parsed.syntax();
let index = SemanticIndex::from_ast(ast);
let table = &index.symbol_table;
let x_sym = table
.root_symbol_by_name("x")
.expect("x symbol should exist");
let x_stmt = ast.body[0].as_assign_stmt().unwrap();
let x_id = index.expression_id(&x_stmt.targets[0]);
assert_eq!(table.scope_of_expression(x_id).kind(), ScopeKind::Module);
assert_eq!(table.scope_id_of_expression(x_id), x_sym.scope_id());
let def = ast.body[1].as_function_def_stmt().unwrap();
let y_stmt = def.body[0].as_assign_stmt().unwrap();
let y_id = index.expression_id(&y_stmt.targets[0]);
assert_eq!(table.scope_of_expression(y_id).kind(), ScopeKind::Function);
}
}

View File

@@ -0,0 +1,52 @@
use crate::ast_ids::TypedNodeKey;
use crate::semantic::ModuleName;
use crate::Name;
use ruff_python_ast as ast;
// TODO storing TypedNodeKey for definitions means we have to search to find them again in the AST;
// this is at best O(log n). If looking up definitions is a bottleneck we should look for
// alternatives here.
// TODO intern Definitions in SymbolTable and reference using IDs?
#[derive(Clone, Debug)]
pub enum Definition {
// For the import cases, we don't need reference to any arbitrary AST subtrees (annotations,
// RHS), and referencing just the import statement node is imprecise (a single import statement
// can assign many symbols, we'd have to re-search for the one we care about), so we just copy
// the small amount of information we need from the AST.
Import(ImportDefinition),
ImportFrom(ImportFromDefinition),
ClassDef(TypedNodeKey<ast::StmtClassDef>),
FunctionDef(TypedNodeKey<ast::StmtFunctionDef>),
Assignment(TypedNodeKey<ast::StmtAssign>),
AnnotatedAssignment(TypedNodeKey<ast::StmtAnnAssign>),
NamedExpr(TypedNodeKey<ast::ExprNamed>),
/// represents the implicit initial definition of every name as "unbound"
Unbound,
// TODO with statements, except handlers, function args...
}
#[derive(Clone, Debug)]
pub struct ImportDefinition {
pub module: ModuleName,
}
#[derive(Clone, Debug)]
pub struct ImportFromDefinition {
pub module: Option<ModuleName>,
pub name: Name,
pub level: u32,
}
impl ImportFromDefinition {
pub fn module(&self) -> Option<&ModuleName> {
self.module.as_ref()
}
pub fn name(&self) -> &Name {
&self.name
}
pub fn level(&self) -> u32 {
self.level
}
}
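For illustration, `from .utils import helper as h` would be recorded roughly like this (a sketch; cf. the `ImportFrom` arm of the semantic indexer, which registers the symbol under the alias `h`):

```rust
let def = Definition::ImportFrom(ImportFromDefinition {
    module: Some(ModuleName::new("utils")),
    name: Name::new("helper"),
    level: 1, // one leading dot
});
```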

View File

@@ -0,0 +1,270 @@
use super::symbol_table::SymbolId;
use crate::semantic::{Definition, ExpressionId};
use ruff_index::{newtype_index, IndexVec};
use std::iter::FusedIterator;
use std::ops::Range;
#[newtype_index]
pub struct FlowNodeId;
#[derive(Debug)]
pub(crate) enum FlowNode {
Start,
Definition(DefinitionFlowNode),
Branch(BranchFlowNode),
Phi(PhiFlowNode),
Constraint(ConstraintFlowNode),
}
/// A point in control flow where a symbol is defined
#[derive(Debug)]
pub(crate) struct DefinitionFlowNode {
symbol_id: SymbolId,
definition: Definition,
predecessor: FlowNodeId,
}
/// A branch in control flow
#[derive(Debug)]
pub(crate) struct BranchFlowNode {
predecessor: FlowNodeId,
}
/// A join point where control flow paths come together
#[derive(Debug)]
pub(crate) struct PhiFlowNode {
first_predecessor: FlowNodeId,
second_predecessor: FlowNodeId,
}
/// A branch test which may apply constraints to a symbol's type
#[derive(Debug)]
pub(crate) struct ConstraintFlowNode {
predecessor: FlowNodeId,
test_expression: ExpressionId,
}
#[derive(Debug)]
pub struct FlowGraph {
flow_nodes_by_id: IndexVec<FlowNodeId, FlowNode>,
expression_map: IndexVec<ExpressionId, FlowNodeId>,
}
impl FlowGraph {
pub fn start() -> FlowNodeId {
FlowNodeId::from_usize(0)
}
pub fn for_expr(&self, expr: ExpressionId) -> FlowNodeId {
self.expression_map[expr]
}
}
#[derive(Debug)]
pub(crate) struct FlowGraphBuilder {
flow_graph: FlowGraph,
}
impl FlowGraphBuilder {
pub(crate) fn new() -> Self {
let mut graph = FlowGraph {
flow_nodes_by_id: IndexVec::default(),
expression_map: IndexVec::default(),
};
graph.flow_nodes_by_id.push(FlowNode::Start);
Self { flow_graph: graph }
}
pub(crate) fn add(&mut self, node: FlowNode) -> FlowNodeId {
self.flow_graph.flow_nodes_by_id.push(node)
}
pub(crate) fn add_definition(
&mut self,
symbol_id: SymbolId,
definition: Definition,
predecessor: FlowNodeId,
) -> FlowNodeId {
self.add(FlowNode::Definition(DefinitionFlowNode {
symbol_id,
definition,
predecessor,
}))
}
pub(crate) fn add_branch(&mut self, predecessor: FlowNodeId) -> FlowNodeId {
self.add(FlowNode::Branch(BranchFlowNode { predecessor }))
}
pub(crate) fn add_phi(
&mut self,
first_predecessor: FlowNodeId,
second_predecessor: FlowNodeId,
) -> FlowNodeId {
self.add(FlowNode::Phi(PhiFlowNode {
first_predecessor,
second_predecessor,
}))
}
pub(crate) fn add_constraint(
&mut self,
predecessor: FlowNodeId,
test_expression: ExpressionId,
) -> FlowNodeId {
self.add(FlowNode::Constraint(ConstraintFlowNode {
predecessor,
test_expression,
}))
}
pub(super) fn record_expr(&mut self, node_id: FlowNodeId) -> ExpressionId {
self.flow_graph.expression_map.push(node_id)
}
pub(super) fn finish(mut self) -> FlowGraph {
self.flow_graph.flow_nodes_by_id.shrink_to_fit();
self.flow_graph.expression_map.shrink_to_fit();
self.flow_graph
}
}
/// A definition, and the set of constraints between a use and the definition
#[derive(Debug, Clone)]
pub struct ConstrainedDefinition {
pub definition: Definition,
pub constraints: Vec<ExpressionId>,
}
/// A flow node and the constraints we passed through to reach it
#[derive(Debug)]
struct FlowState {
node_id: FlowNodeId,
constraints_range: Range<usize>,
}
#[derive(Debug)]
pub struct ReachableDefinitionsIterator<'a> {
flow_graph: &'a FlowGraph,
symbol_id: SymbolId,
pending: Vec<FlowState>,
constraints: Vec<ExpressionId>,
}
impl<'a> ReachableDefinitionsIterator<'a> {
pub fn new(flow_graph: &'a FlowGraph, symbol_id: SymbolId, start_node_id: FlowNodeId) -> Self {
Self {
flow_graph,
symbol_id,
pending: vec![FlowState {
node_id: start_node_id,
constraints_range: 0..0,
}],
constraints: vec![],
}
}
}
impl<'a> Iterator for ReachableDefinitionsIterator<'a> {
type Item = ConstrainedDefinition;
fn next(&mut self) -> Option<Self::Item> {
let FlowState {
mut node_id,
mut constraints_range,
} = self.pending.pop()?;
self.constraints.truncate(constraints_range.end + 1);
loop {
match &self.flow_graph.flow_nodes_by_id[node_id] {
FlowNode::Start => {
// constraints on unbound are irrelevant
return Some(ConstrainedDefinition {
definition: Definition::Unbound,
constraints: vec![],
});
}
FlowNode::Definition(def_node) => {
if def_node.symbol_id == self.symbol_id {
return Some(ConstrainedDefinition {
definition: def_node.definition.clone(),
constraints: self.constraints[constraints_range].to_vec(),
});
}
node_id = def_node.predecessor;
}
FlowNode::Branch(branch_node) => {
node_id = branch_node.predecessor;
}
FlowNode::Phi(phi_node) => {
self.pending.push(FlowState {
node_id: phi_node.first_predecessor,
constraints_range: constraints_range.clone(),
});
node_id = phi_node.second_predecessor;
}
FlowNode::Constraint(constraint_node) => {
node_id = constraint_node.predecessor;
self.constraints.push(constraint_node.test_expression);
constraints_range.end += 1;
}
}
}
}
}
impl<'a> FusedIterator for ReachableDefinitionsIterator<'a> {}
impl std::fmt::Display for FlowGraph {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
writeln!(f, "flowchart TD")?;
for (id, node) in self.flow_nodes_by_id.iter_enumerated() {
write!(f, " id{}", id.as_u32())?;
match node {
FlowNode::Start => writeln!(f, r"[\Start/]")?,
FlowNode::Definition(def_node) => {
writeln!(f, r"(Define symbol {})", def_node.symbol_id.as_u32())?;
writeln!(
f,
r" id{}-->id{}",
def_node.predecessor.as_u32(),
id.as_u32()
)?;
}
FlowNode::Branch(branch_node) => {
writeln!(f, r"{{Branch}}")?;
writeln!(
f,
r" id{}-->id{}",
branch_node.predecessor.as_u32(),
id.as_u32()
)?;
}
FlowNode::Phi(phi_node) => {
writeln!(f, r"((Phi))")?;
writeln!(
f,
r" id{}-->id{}",
phi_node.second_predecessor.as_u32(),
id.as_u32()
)?;
writeln!(
f,
r" id{}-->id{}",
phi_node.first_predecessor.as_u32(),
id.as_u32()
)?;
}
FlowNode::Constraint(constraint_node) => {
writeln!(f, r"((Constraint))")?;
writeln!(
f,
r" id{}-->id{}",
constraint_node.predecessor.as_u32(),
id.as_u32()
)?;
}
}
}
Ok(())
}
}
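// Sample output (sketch): a graph with a single definition whose predecessor
// is the start node renders roughly as
//
//     flowchart TD
//       id0[\Start/]
//       id1(Define symbol 0)
//       id0-->id1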


@@ -0,0 +1,560 @@
#![allow(dead_code)]
use std::hash::{Hash, Hasher};
use std::iter::{Copied, DoubleEndedIterator, FusedIterator};
use std::num::NonZeroU32;
use bitflags::bitflags;
use hashbrown::hash_map::{Keys, RawEntryMut};
use rustc_hash::{FxHashMap, FxHasher};
use ruff_index::{newtype_index, IndexVec};
use crate::ast_ids::NodeKey;
use crate::module::ModuleName;
use crate::semantic::{Definition, ExpressionId};
use crate::Name;
type Map<K, V> = hashbrown::HashMap<K, V, ()>;
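// Note: the unit `()` build-hasher is deliberate. `symbols_by_name` stores
// only `SymbolId` keys; hashes are computed externally from the symbol's
// name (see `SymbolTable::hash_name`) and supplied through the raw-entry
// API, so the map itself never hashes anything.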
#[newtype_index]
pub struct ScopeId;
impl ScopeId {
pub fn scope(self, table: &SymbolTable) -> &Scope {
&table.scopes_by_id[self]
}
}
#[newtype_index]
pub struct SymbolId;
impl SymbolId {
pub fn symbol(self, table: &SymbolTable) -> &Symbol {
&table.symbols_by_id[self]
}
}
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum ScopeKind {
Module,
Annotation,
Class,
Function,
}
#[derive(Debug)]
pub struct Scope {
name: Name,
kind: ScopeKind,
parent: Option<ScopeId>,
children: Vec<ScopeId>,
/// the definition (e.g. class or function) that created this scope
definition: Option<Definition>,
/// the symbol (e.g. class or function) that owns this scope
defining_symbol: Option<SymbolId>,
/// symbol IDs, hashed by symbol name
symbols_by_name: Map<SymbolId, ()>,
}
impl Scope {
pub fn name(&self) -> &str {
self.name.as_str()
}
pub fn kind(&self) -> ScopeKind {
self.kind
}
pub fn definition(&self) -> Option<Definition> {
self.definition.clone()
}
pub fn defining_symbol(&self) -> Option<SymbolId> {
self.defining_symbol
}
}
#[derive(Debug)]
pub(crate) enum Kind {
FreeVar,
CellVar,
CellVarAssigned,
ExplicitGlobal,
ImplicitGlobal,
}
bitflags! {
#[derive(Copy,Clone,Debug)]
pub struct SymbolFlags: u8 {
const IS_USED = 1 << 0;
const IS_DEFINED = 1 << 1;
/// TODO: This flag is not yet set by anything
const MARKED_GLOBAL = 1 << 2;
/// TODO: This flag is not yet set by anything
const MARKED_NONLOCAL = 1 << 3;
}
}
#[derive(Debug)]
pub struct Symbol {
name: Name,
flags: SymbolFlags,
scope_id: ScopeId,
// kind: Kind,
}
impl Symbol {
pub fn name(&self) -> &str {
self.name.as_str()
}
pub fn scope_id(&self) -> ScopeId {
self.scope_id
}
/// Is the symbol used in its containing scope?
pub fn is_used(&self) -> bool {
self.flags.contains(SymbolFlags::IS_USED)
}
/// Is the symbol defined in its containing scope?
pub fn is_defined(&self) -> bool {
self.flags.contains(SymbolFlags::IS_DEFINED)
}
// TODO: implement a two-pass analysis to categorize Symbol.kind as free-var, cell-var,
// explicit-global, or implicit-global, by modifying the preorder traversal code
}
#[derive(Debug, Clone)]
pub enum Dependency {
Module(ModuleName),
Relative {
level: NonZeroU32,
module: Option<ModuleName>,
},
}
/// Table of all symbols in all scopes for a module.
#[derive(Debug)]
pub struct SymbolTable {
scopes_by_id: IndexVec<ScopeId, Scope>,
symbols_by_id: IndexVec<SymbolId, Symbol>,
/// the definitions for each symbol
defs: FxHashMap<SymbolId, Vec<Definition>>,
/// map of AST node (e.g. class/function def) to sub-scope it creates
scopes_by_node: FxHashMap<NodeKey, ScopeId>,
/// Maps expressions to their enclosing scope.
expression_scopes: IndexVec<ExpressionId, ScopeId>,
/// dependencies of this module
dependencies: Vec<Dependency>,
}
impl SymbolTable {
pub fn dependencies(&self) -> &[Dependency] {
&self.dependencies
}
pub const fn root_scope_id() -> ScopeId {
ScopeId::from_usize(0)
}
pub fn root_scope(&self) -> &Scope {
&self.scopes_by_id[SymbolTable::root_scope_id()]
}
pub fn symbol_ids_for_scope(&self, scope_id: ScopeId) -> Copied<Keys<SymbolId, ()>> {
self.scopes_by_id[scope_id].symbols_by_name.keys().copied()
}
pub fn symbols_for_scope(
&self,
scope_id: ScopeId,
) -> SymbolIterator<Copied<Keys<SymbolId, ()>>> {
SymbolIterator {
table: self,
ids: self.symbol_ids_for_scope(scope_id),
}
}
pub fn root_symbol_ids(&self) -> Copied<Keys<SymbolId, ()>> {
self.symbol_ids_for_scope(SymbolTable::root_scope_id())
}
pub fn root_symbols(&self) -> SymbolIterator<Copied<Keys<SymbolId, ()>>> {
self.symbols_for_scope(SymbolTable::root_scope_id())
}
pub fn child_scope_ids_of(&self, scope_id: ScopeId) -> &[ScopeId] {
&self.scopes_by_id[scope_id].children
}
pub fn child_scopes_of(&self, scope_id: ScopeId) -> ScopeIterator<&[ScopeId]> {
ScopeIterator {
table: self,
ids: self.child_scope_ids_of(scope_id),
}
}
pub fn root_child_scope_ids(&self) -> &[ScopeId] {
self.child_scope_ids_of(SymbolTable::root_scope_id())
}
pub fn root_child_scopes(&self) -> ScopeIterator<&[ScopeId]> {
self.child_scopes_of(SymbolTable::root_scope_id())
}
pub fn symbol_id_by_name(&self, scope_id: ScopeId, name: &str) -> Option<SymbolId> {
let scope = &self.scopes_by_id[scope_id];
let hash = SymbolTable::hash_name(name);
let name = Name::new(name);
Some(
*scope
.symbols_by_name
.raw_entry()
.from_hash(hash, |symid| self.symbols_by_id[*symid].name == name)?
.0,
)
}
pub fn symbol_by_name(&self, scope_id: ScopeId, name: &str) -> Option<&Symbol> {
Some(&self.symbols_by_id[self.symbol_id_by_name(scope_id, name)?])
}
pub fn root_symbol_id_by_name(&self, name: &str) -> Option<SymbolId> {
self.symbol_id_by_name(SymbolTable::root_scope_id(), name)
}
pub fn root_symbol_by_name(&self, name: &str) -> Option<&Symbol> {
self.symbol_by_name(SymbolTable::root_scope_id(), name)
}
pub fn scope_id_of_symbol(&self, symbol_id: SymbolId) -> ScopeId {
self.symbols_by_id[symbol_id].scope_id
}
pub fn scope_of_symbol(&self, symbol_id: SymbolId) -> &Scope {
&self.scopes_by_id[self.scope_id_of_symbol(symbol_id)]
}
pub fn scope_id_of_expression(&self, expression: ExpressionId) -> ScopeId {
self.expression_scopes[expression]
}
pub fn scope_of_expression(&self, expr_id: ExpressionId) -> &Scope {
&self.scopes_by_id[self.scope_id_of_expression(expr_id)]
}
pub fn parent_scopes(
&self,
scope_id: ScopeId,
) -> ScopeIterator<impl Iterator<Item = ScopeId> + '_> {
ScopeIterator {
table: self,
ids: std::iter::successors(Some(scope_id), |scope| self.scopes_by_id[*scope].parent),
}
}
pub fn parent_scope(&self, scope_id: ScopeId) -> Option<ScopeId> {
self.scopes_by_id[scope_id].parent
}
pub fn scope_id_for_node(&self, node_key: &NodeKey) -> ScopeId {
self.scopes_by_node[node_key]
}
pub fn definitions(&self, symbol_id: SymbolId) -> &[Definition] {
self.defs
.get(&symbol_id)
.map(std::vec::Vec::as_slice)
.unwrap_or_default()
}
pub fn all_definitions(&self) -> impl Iterator<Item = (SymbolId, &Definition)> + '_ {
self.defs
.iter()
.flat_map(|(sym_id, defs)| defs.iter().map(move |def| (*sym_id, def)))
}
fn hash_name(name: &str) -> u64 {
let mut hasher = FxHasher::default();
name.hash(&mut hasher);
hasher.finish()
}
}
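// Illustrative usage (sketch, assuming a populated `table: SymbolTable`):
//
//     let scope_id = SymbolTable::root_scope_id();
//     if let Some(symbol_id) = table.symbol_id_by_name(scope_id, "foo") {
//         assert_eq!(table.scope_id_of_symbol(symbol_id), scope_id);
//     }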
pub struct SymbolIterator<'a, I> {
table: &'a SymbolTable,
ids: I,
}
impl<'a, I> Iterator for SymbolIterator<'a, I>
where
I: Iterator<Item = SymbolId>,
{
type Item = &'a Symbol;
fn next(&mut self) -> Option<Self::Item> {
let id = self.ids.next()?;
Some(&self.table.symbols_by_id[id])
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.ids.size_hint()
}
}
impl<'a, I> FusedIterator for SymbolIterator<'a, I> where
I: Iterator<Item = SymbolId> + FusedIterator
{
}
impl<'a, I> DoubleEndedIterator for SymbolIterator<'a, I>
where
I: Iterator<Item = SymbolId> + DoubleEndedIterator,
{
fn next_back(&mut self) -> Option<Self::Item> {
let id = self.ids.next_back()?;
Some(&self.table.symbols_by_id[id])
}
}
// TODO maybe get rid of this and just do all data access via methods on ScopeId?
pub struct ScopeIterator<'a, I> {
table: &'a SymbolTable,
ids: I,
}
/// iterate over (`ScopeId`, `Scope`) pairs for a given `ScopeId` iterator
impl<'a, I> Iterator for ScopeIterator<'a, I>
where
I: Iterator<Item = ScopeId>,
{
type Item = (ScopeId, &'a Scope);
fn next(&mut self) -> Option<Self::Item> {
let id = self.ids.next()?;
Some((id, &self.table.scopes_by_id[id]))
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.ids.size_hint()
}
}
impl<'a, I> FusedIterator for ScopeIterator<'a, I> where I: Iterator<Item = ScopeId> + FusedIterator {}
impl<'a, I> DoubleEndedIterator for ScopeIterator<'a, I>
where
I: Iterator<Item = ScopeId> + DoubleEndedIterator,
{
fn next_back(&mut self) -> Option<Self::Item> {
let id = self.ids.next_back()?;
Some((id, &self.table.scopes_by_id[id]))
}
}
#[derive(Debug)]
pub(super) struct SymbolTableBuilder {
symbol_table: SymbolTable,
}
impl SymbolTableBuilder {
pub(super) fn new() -> Self {
let mut table = SymbolTable {
scopes_by_id: IndexVec::new(),
symbols_by_id: IndexVec::new(),
defs: FxHashMap::default(),
scopes_by_node: FxHashMap::default(),
expression_scopes: IndexVec::new(),
dependencies: Vec::new(),
};
table.scopes_by_id.push(Scope {
name: Name::new("<module>"),
kind: ScopeKind::Module,
parent: None,
children: Vec::new(),
definition: None,
defining_symbol: None,
symbols_by_name: Map::default(),
});
Self {
symbol_table: table,
}
}
pub(super) fn finish(self) -> SymbolTable {
let mut symbol_table = self.symbol_table;
symbol_table.scopes_by_id.shrink_to_fit();
symbol_table.symbols_by_id.shrink_to_fit();
symbol_table.defs.shrink_to_fit();
symbol_table.scopes_by_node.shrink_to_fit();
symbol_table.expression_scopes.shrink_to_fit();
symbol_table.dependencies.shrink_to_fit();
symbol_table
}
pub(super) fn add_or_update_symbol(
&mut self,
scope_id: ScopeId,
name: &str,
flags: SymbolFlags,
) -> SymbolId {
let hash = SymbolTable::hash_name(name);
let scope = &mut self.symbol_table.scopes_by_id[scope_id];
let name = Name::new(name);
let entry = scope
.symbols_by_name
.raw_entry_mut()
.from_hash(hash, |existing| {
self.symbol_table.symbols_by_id[*existing].name == name
});
match entry {
RawEntryMut::Occupied(entry) => {
if let Some(symbol) = self.symbol_table.symbols_by_id.get_mut(*entry.key()) {
symbol.flags.insert(flags);
};
*entry.key()
}
RawEntryMut::Vacant(entry) => {
let id = self.symbol_table.symbols_by_id.push(Symbol {
name,
flags,
scope_id,
});
entry.insert_with_hasher(hash, id, (), |symid| {
SymbolTable::hash_name(&self.symbol_table.symbols_by_id[*symid].name)
});
id
}
}
}
pub(super) fn add_definition(&mut self, symbol_id: SymbolId, definition: Definition) {
self.symbol_table
.defs
.entry(symbol_id)
.or_default()
.push(definition);
}
pub(super) fn add_child_scope(
&mut self,
parent_scope_id: ScopeId,
name: &str,
kind: ScopeKind,
definition: Option<Definition>,
defining_symbol: Option<SymbolId>,
) -> ScopeId {
let new_scope_id = self.symbol_table.scopes_by_id.push(Scope {
name: Name::new(name),
kind,
parent: Some(parent_scope_id),
children: Vec::new(),
definition,
defining_symbol,
symbols_by_name: Map::default(),
});
let parent_scope = &mut self.symbol_table.scopes_by_id[parent_scope_id];
parent_scope.children.push(new_scope_id);
new_scope_id
}
pub(super) fn record_scope_for_node(&mut self, node_key: NodeKey, scope_id: ScopeId) {
self.symbol_table.scopes_by_node.insert(node_key, scope_id);
}
pub(super) fn add_dependency(&mut self, dependency: Dependency) {
self.symbol_table.dependencies.push(dependency);
}
/// Records the scope for the current expression
pub(super) fn record_expression(&mut self, scope: ScopeId) -> ExpressionId {
self.symbol_table.expression_scopes.push(scope)
}
}
#[cfg(test)]
mod tests {
use super::{ScopeKind, SymbolFlags, SymbolTable, SymbolTableBuilder};
#[test]
fn insert_same_name_symbol_twice() {
let mut builder = SymbolTableBuilder::new();
let root_scope_id = SymbolTable::root_scope_id();
let symbol_id_1 =
builder.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::IS_DEFINED);
let symbol_id_2 = builder.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::IS_USED);
let table = builder.finish();
assert_eq!(symbol_id_1, symbol_id_2);
assert!(symbol_id_1.symbol(&table).is_used(), "flags must merge");
assert!(symbol_id_1.symbol(&table).is_defined(), "flags must merge");
}
#[test]
fn insert_different_named_symbols() {
let mut builder = SymbolTableBuilder::new();
let root_scope_id = SymbolTable::root_scope_id();
let symbol_id_1 = builder.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let symbol_id_2 = builder.add_or_update_symbol(root_scope_id, "bar", SymbolFlags::empty());
assert_ne!(symbol_id_1, symbol_id_2);
}
#[test]
fn add_child_scope_with_symbol() {
let mut builder = SymbolTableBuilder::new();
let root_scope_id = SymbolTable::root_scope_id();
let foo_symbol_top =
builder.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let c_scope = builder.add_child_scope(root_scope_id, "C", ScopeKind::Class, None, None);
let foo_symbol_inner = builder.add_or_update_symbol(c_scope, "foo", SymbolFlags::empty());
assert_ne!(foo_symbol_top, foo_symbol_inner);
}
#[test]
fn scope_from_id() {
let table = SymbolTableBuilder::new().finish();
let root_scope_id = SymbolTable::root_scope_id();
let scope = root_scope_id.scope(&table);
assert_eq!(scope.name.as_str(), "<module>");
assert_eq!(scope.kind, ScopeKind::Module);
}
#[test]
fn symbol_from_id() {
let mut builder = SymbolTableBuilder::new();
let root_scope_id = SymbolTable::root_scope_id();
let foo_symbol_id =
builder.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
let table = builder.finish();
let symbol = foo_symbol_id.symbol(&table);
assert_eq!(symbol.name(), "foo");
}
#[test]
fn bigger_symbol_table() {
let mut builder = SymbolTableBuilder::new();
let root_scope_id = SymbolTable::root_scope_id();
let foo_symbol_id =
builder.add_or_update_symbol(root_scope_id, "foo", SymbolFlags::empty());
builder.add_or_update_symbol(root_scope_id, "bar", SymbolFlags::empty());
builder.add_or_update_symbol(root_scope_id, "baz", SymbolFlags::empty());
builder.add_or_update_symbol(root_scope_id, "qux", SymbolFlags::empty());
let table = builder.finish();
let foo_symbol_id_2 = table
.root_symbol_id_by_name("foo")
.expect("foo symbol to be found");
assert_eq!(foo_symbol_id_2, foo_symbol_id);
}
}


@@ -2,19 +2,23 @@
use crate::ast_ids::NodeKey;
use crate::db::{QueryResult, SemanticDb, SemanticJar};
use crate::files::FileId;
use crate::symbols::{symbol_table, GlobalSymbolId, ScopeId, ScopeKind, SymbolId};
use crate::module::{Module, ModuleName};
use crate::semantic::{
resolve_global_symbol, semantic_index, GlobalSymbolId, ScopeId, ScopeKind, SymbolId,
};
use crate::{FxDashMap, FxIndexSet, Name};
use ruff_index::{newtype_index, IndexVec};
use ruff_python_ast as ast;
use rustc_hash::FxHashMap;
pub(crate) mod infer;
pub(crate) use infer::{infer_definition_type, infer_symbol_type};
pub(crate) use infer::{infer_definition_type, infer_symbol_public_type};
/// unique ID for a type
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum Type {
/// the dynamic or gradual type: a statically-unknown set of values
/// the dynamic type: a statically-unknown set of values
Any,
/// the empty set of values
Never,
@@ -23,14 +27,19 @@ pub enum Type {
Unknown,
/// name is not bound to any value
Unbound,
/// the None object (TODO: remove this in favor of Instance(types.NoneType))
None,
/// a specific function object
Function(FunctionTypeId),
/// a specific module object
Module(ModuleTypeId),
/// a specific class object
Class(ClassTypeId),
/// the set of Python objects with the given class in their __class__'s method resolution order
Instance(ClassTypeId),
Union(UnionTypeId),
Intersection(IntersectionTypeId),
IntLiteral(i64),
// TODO protocols, callable types, overloads, generics, type vars
}
@@ -46,6 +55,90 @@ impl Type {
pub const fn is_unknown(&self) -> bool {
matches!(self, Type::Unknown)
}
pub fn get_member(&self, db: &dyn SemanticDb, name: &Name) -> QueryResult<Option<Type>> {
match self {
Type::Any => Ok(Some(Type::Any)),
Type::Never => todo!("attribute lookup on Never type"),
Type::Unknown => Ok(Some(Type::Unknown)),
Type::Unbound => todo!("attribute lookup on Unbound type"),
Type::None => todo!("attribute lookup on None type"),
Type::Function(_) => todo!("attribute lookup on Function type"),
Type::Module(module_id) => module_id.get_member(db, name),
Type::Class(class_id) => class_id.get_class_member(db, name),
Type::Instance(_) => {
// TODO MRO? get_own_instance_member, get_instance_member
todo!("attribute lookup on Instance type")
}
Type::Union(union_id) => {
let jar: &SemanticJar = db.jar()?;
let _todo_union_ref = jar.type_store.get_union(*union_id);
// TODO perform the get_member on each type in the union
// TODO return the union of those results
// TODO if any of those results is `None` then include Unknown in the result union
todo!("attribute lookup on Union type")
}
Type::Intersection(_) => {
// TODO perform the get_member on each type in the intersection
// TODO return the intersection of those results
todo!("attribute lookup on Intersection type")
}
Type::IntLiteral(_) => {
// TODO raise error
Ok(Some(Type::Unknown))
}
}
}
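// For example, for `b.C` where `b` has type `Module(..)`, `get_member`
// resolves `C` in module `b`'s global scope and returns its inferred public
// type; for a `Class` type it goes through `get_class_member` (own members
// first, then superclass members).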
// when this is fully fleshed out, it will use the `db` arg and may return `QueryError`
#[allow(clippy::unnecessary_wraps)]
pub fn resolve_bin_op(
&self,
_db: &dyn SemanticDb,
op: ast::Operator,
right_ty: Type,
) -> QueryResult<Type> {
match self {
Type::Any => Ok(Type::Any),
Type::Unknown => Ok(Type::Unknown),
Type::IntLiteral(n) => {
match right_ty {
Type::IntLiteral(m) => {
match op {
ast::Operator::Add => Ok(n
.checked_add(m)
.map(Type::IntLiteral)
// TODO builtins.int
.unwrap_or(Type::Unknown)),
ast::Operator::Sub => Ok(n
.checked_sub(m)
.map(Type::IntLiteral)
// TODO builtins.int
.unwrap_or(Type::Unknown)),
ast::Operator::Mult => Ok(n
.checked_mul(m)
.map(Type::IntLiteral)
// TODO builtins.int
.unwrap_or(Type::Unknown)),
ast::Operator::Div => Ok(n
.checked_div(m)
.map(Type::IntLiteral)
// TODO builtins.int
.unwrap_or(Type::Unknown)),
ast::Operator::Mod => Ok(n
.checked_rem(m)
.map(Type::IntLiteral)
// TODO division by zero error
.unwrap_or(Type::Unknown)),
_ => todo!("complete binop op support for IntLiteral"),
}
}
_ => todo!("complete binop right_ty support for IntLiteral"),
}
}
_ => todo!("complete binop support"),
}
}
}
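// For example (sketch): `Type::IntLiteral(2).resolve_bin_op(db, Add, Type::IntLiteral(3))`
// yields `Type::IntLiteral(5)`, while an `i64` overflow (a `checked_*` call
// returning `None`) falls back to `Type::Unknown` until `builtins.int` is
// supported.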
impl From<FunctionTypeId> for Type {
@@ -80,7 +173,7 @@ impl TypeStore {
self.modules.remove(&file_id);
}
pub fn cache_symbol_type(&self, symbol: GlobalSymbolId, ty: Type) {
pub fn cache_symbol_public_type(&self, symbol: GlobalSymbolId, ty: Type) {
self.add_or_get_module(symbol.file_id)
.symbol_types
.insert(symbol.symbol_id, ty);
@@ -92,7 +185,7 @@ impl TypeStore {
.insert(node_key, ty);
}
pub fn get_cached_symbol_type(&self, symbol: GlobalSymbolId) -> Option<Type> {
pub fn get_cached_symbol_public_type(&self, symbol: GlobalSymbolId) -> Option<Type> {
self.try_get_module(symbol.file_id)?
.symbol_types
.get(&symbol.symbol_id)
@@ -120,7 +213,7 @@ impl TypeStore {
self.modules.get(&file_id)
}
fn add_function(
fn add_function_type(
&self,
file_id: FileId,
name: &str,
@@ -132,7 +225,18 @@ impl TypeStore {
.add_function(name, symbol_id, scope_id, decorators)
}
fn add_class(
fn add_function(
&self,
file_id: FileId,
name: &str,
symbol_id: SymbolId,
scope_id: ScopeId,
decorators: Vec<Type>,
) -> Type {
Type::Function(self.add_function_type(file_id, name, symbol_id, scope_id, decorators))
}
fn add_class_type(
&self,
file_id: FileId,
name: &str,
@@ -143,12 +247,36 @@ impl TypeStore {
.add_class(name, scope_id, bases)
}
fn add_union(&mut self, file_id: FileId, elems: &[Type]) -> UnionTypeId {
fn add_class(&self, file_id: FileId, name: &str, scope_id: ScopeId, bases: Vec<Type>) -> Type {
Type::Class(self.add_class_type(file_id, name, scope_id, bases))
}
/// add "raw" union type with exactly given elements
fn add_union_type(&self, file_id: FileId, elems: &[Type]) -> UnionTypeId {
self.add_or_get_module(file_id).add_union(elems)
}
fn add_intersection(
&mut self,
/// add union with normalization; may not return a `UnionType`
fn add_union(&self, file_id: FileId, elems: &[Type]) -> Type {
let mut flattened = Vec::with_capacity(elems.len());
for ty in elems {
match ty {
Type::Union(union_id) => flattened.extend(union_id.elements(self)),
_ => flattened.push(*ty),
}
}
// TODO don't add identical unions
// TODO de-duplicate union elements
match flattened[..] {
[] => Type::Never,
[ty] => ty,
_ => Type::Union(self.add_union_type(file_id, &flattened)),
}
}
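// Normalization examples (mirroring the `flatten_union_*` tests below):
// `add_union(f, &[])` is `Never`; `add_union(f, &[ty])` is `ty` itself; and
// a nested union such as `add_union(f, &[u, None])`, where `u` is a
// two-element union, flattens into a single three-element union.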
/// add "raw" intersection type with exactly given elements
fn add_intersection_type(
&self,
file_id: FileId,
positive: &[Type],
negative: &[Type],
@@ -157,6 +285,38 @@ impl TypeStore {
.add_intersection(positive, negative)
}
/// add intersection with normalization; may not return an `IntersectionType`
fn add_intersection(&self, file_id: FileId, positive: &[Type], negative: &[Type]) -> Type {
let mut pos_flattened = Vec::with_capacity(positive.len());
let mut neg_flattened = Vec::with_capacity(negative.len());
for ty in positive {
match ty {
Type::Intersection(intersection_id) => {
pos_flattened.extend(intersection_id.positive(self));
neg_flattened.extend(intersection_id.negative(self));
}
_ => pos_flattened.push(*ty),
}
}
for ty in negative {
match ty {
Type::Intersection(intersection_id) => {
pos_flattened.extend(intersection_id.negative(self));
neg_flattened.extend(intersection_id.positive(self));
}
_ => neg_flattened.push(*ty),
}
}
// TODO don't add identical intersections
// TODO deduplicate intersection elements
// TODO maintain DNF form (union of intersections)
match (&pos_flattened[..], &neg_flattened[..]) {
([], []) => Type::Any, // TODO should be object
([ty], []) => *ty,
(pos, neg) => Type::Intersection(self.add_intersection_type(file_id, pos, neg)),
}
}
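// Normalization examples (mirroring the `flatten_intersection_*` tests
// below): `add_intersection(f, &[], &[])` is currently `Any` (TODO: object);
// `add_intersection(f, &[ty], &[])` is `ty`; and a nested intersection in
// positive position contributes its positive elements to `positive` and its
// negative elements to `negative`.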
fn get_function(&self, id: FunctionTypeId) -> FunctionTypeRef {
FunctionTypeRef {
module_store: self.get_module(id.file_id),
@@ -292,10 +452,11 @@ impl FunctionTypeId {
self,
db: &dyn SemanticDb,
) -> QueryResult<Option<ClassTypeId>> {
let table = symbol_table(db, self.file_id)?;
let index = semantic_index(db, self.file_id)?;
let table = index.symbol_table();
let FunctionType { symbol_id, .. } = *self.function(db)?;
let scope_id = symbol_id.symbol(&table).scope_id();
let scope = scope_id.scope(&table);
let scope_id = symbol_id.symbol(table).scope_id();
let scope = scope_id.scope(table);
if !matches!(scope.kind(), ScopeKind::Class) {
return Ok(None);
};
@@ -336,6 +497,31 @@ impl FunctionTypeId {
}
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct ModuleTypeId {
module: Module,
file_id: FileId,
}
impl ModuleTypeId {
fn module(self, db: &dyn SemanticDb) -> QueryResult<ModuleStoreRef> {
let jar: &SemanticJar = db.jar()?;
Ok(jar.type_store.add_or_get_module(self.file_id).downgrade())
}
pub(crate) fn name(self, db: &dyn SemanticDb) -> QueryResult<ModuleName> {
self.module.name(db)
}
fn get_member(self, db: &dyn SemanticDb, name: &Name) -> QueryResult<Option<Type>> {
if let Some(symbol_id) = resolve_global_symbol(db, self.module, name)? {
Ok(Some(infer_symbol_public_type(db, symbol_id)?))
} else {
Ok(None)
}
}
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct ClassTypeId {
file_id: FileId,
@@ -375,9 +561,9 @@ impl ClassTypeId {
fn get_own_class_member(self, db: &dyn SemanticDb, name: &Name) -> QueryResult<Option<Type>> {
// TODO: this should distinguish instance-only members (e.g. `x: int`) and not return them
let ClassType { scope_id, .. } = *self.class(db)?;
let table = symbol_table(db, self.file_id)?;
if let Some(symbol_id) = table.symbol_id_by_name(scope_id, name) {
Ok(Some(infer_symbol_type(
let index = semantic_index(db, self.file_id)?;
if let Some(symbol_id) = index.symbol_table().symbol_id_by_name(scope_id, name) {
Ok(Some(infer_symbol_public_type(
db,
GlobalSymbolId {
file_id: self.file_id,
@@ -389,7 +575,13 @@ impl ClassTypeId {
}
}
// TODO: get_own_instance_member, get_class_member, get_instance_member
/// Get own class member or fall back to super-class member.
fn get_class_member(self, db: &dyn SemanticDb, name: &Name) -> QueryResult<Option<Type>> {
self.get_own_class_member(db, name)
.or_else(|_| self.get_super_class_member(db, name))
}
// TODO: get_own_instance_member, get_instance_member
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
@@ -398,12 +590,31 @@ pub struct UnionTypeId {
union_id: ModuleUnionTypeId,
}
impl UnionTypeId {
pub fn elements(self, type_store: &TypeStore) -> Vec<Type> {
let union = type_store.get_union(self);
union.elements.iter().copied().collect()
}
}
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq)]
pub struct IntersectionTypeId {
file_id: FileId,
intersection_id: ModuleIntersectionTypeId,
}
impl IntersectionTypeId {
pub fn positive(self, type_store: &TypeStore) -> Vec<Type> {
let intersection = type_store.get_intersection(self);
intersection.positive.iter().copied().collect()
}
pub fn negative(self, type_store: &TypeStore) -> Vec<Type> {
let intersection = type_store.get_intersection(self);
intersection.negative.iter().copied().collect()
}
}
#[newtype_index]
struct ModuleFunctionTypeId;
@@ -427,7 +638,7 @@ struct ModuleTypeStore {
unions: IndexVec<ModuleUnionTypeId, UnionType>,
/// arena of all intersection types created in this module
intersections: IndexVec<ModuleIntersectionTypeId, IntersectionType>,
/// cached types of symbols in this module
/// cached public types of symbols in this module
symbol_types: FxHashMap<SymbolId, Type>,
/// cached types of AST nodes in this module
node_types: FxHashMap<NodeKey, Type>,
@@ -529,6 +740,11 @@ impl std::fmt::Display for DisplayType<'_> {
Type::Never => f.write_str("Never"),
Type::Unknown => f.write_str("Unknown"),
Type::Unbound => f.write_str("Unbound"),
Type::None => f.write_str("None"),
Type::Module(module_id) => {
// NOTE: something like this?: "<module 'module-name' from 'path-from-fileid'>"
todo!("{module_id:?}")
}
// TODO functions and classes should display using a fully qualified name
Type::Class(class_id) => {
f.write_str("Literal[")?;
@@ -547,6 +763,7 @@ impl std::fmt::Display for DisplayType<'_> {
.get_module(int_id.file_id)
.get_intersection(int_id.intersection_id)
.display(f, self.store),
Type::IntLiteral(n) => write!(f, "Literal[{n}]"),
}
}
}
@@ -605,16 +822,42 @@ pub(crate) struct UnionType {
impl UnionType {
fn display(&self, f: &mut std::fmt::Formatter<'_>, store: &TypeStore) -> std::fmt::Result {
f.write_str("(")?;
let (int_literals, other_types): (Vec<Type>, Vec<Type>) = self
.elements
.iter()
.copied()
.partition(|ty| matches!(ty, Type::IntLiteral(_)));
let mut first = true;
for ty in &self.elements {
if !int_literals.is_empty() {
f.write_str("Literal[")?;
let mut nums: Vec<i64> = int_literals
.into_iter()
.filter_map(|ty| {
if let Type::IntLiteral(n) = ty {
Some(n)
} else {
None
}
})
.collect();
nums.sort_unstable();
for num in nums {
if !first {
f.write_str(", ")?;
}
write!(f, "{num}")?;
first = false;
}
f.write_str("]")?;
}
for ty in other_types {
if !first {
f.write_str(" | ")?;
};
first = false;
write!(f, "{}", ty.display(store))?;
}
f.write_str(")")
Ok(())
}
}
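// Display examples (per the updated tests below): a union of two instances
// renders as `C1 | C2`, and int literals are folded into a single sorted
// element, e.g. `Literal[1, 2] | C1`.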
@@ -634,7 +877,6 @@ pub(crate) struct IntersectionType {
impl IntersectionType {
fn display(&self, f: &mut std::fmt::Formatter<'_>, store: &TypeStore) -> std::fmt::Result {
f.write_str("(")?;
let mut first = true;
for (neg, ty) in self
.positive
@@ -651,25 +893,79 @@ impl IntersectionType {
};
write!(f, "{}", ty.display(store))?;
}
f.write_str(")")
Ok(())
}
}
#[cfg(test)]
mod tests {
use super::Type;
use std::path::Path;
use crate::files::Files;
use crate::symbols::{SymbolFlags, SymbolTable};
use crate::types::{Type, TypeStore};
use crate::semantic::symbol_table::SymbolTableBuilder;
use crate::semantic::{FileId, ScopeId, SymbolFlags, SymbolTable, TypeStore};
use crate::FxIndexSet;
struct TestCase {
store: TypeStore,
files: Files,
file_id: FileId,
root_scope: ScopeId,
}
fn create_test() -> TestCase {
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
TestCase {
store: TypeStore::default(),
files,
file_id,
root_scope: SymbolTable::root_scope_id(),
}
}
fn assert_union_elements(store: &TypeStore, union: Type, elements: &[Type]) {
let Type::Union(union_id) = union else {
panic!("should be a union")
};
assert_eq!(
store.get_union(union_id).elements,
elements.iter().copied().collect::<FxIndexSet<_>>()
);
}
fn assert_intersection_elements(
store: &TypeStore,
intersection: Type,
positive: &[Type],
negative: &[Type],
) {
let Type::Intersection(intersection_id) = intersection else {
panic!("should be a intersection")
};
assert_eq!(
store.get_intersection(intersection_id).positive,
positive.iter().copied().collect::<FxIndexSet<_>>()
);
assert_eq!(
store.get_intersection(intersection_id).negative,
negative.iter().copied().collect::<FxIndexSet<_>>()
);
}
#[test]
fn add_class() {
let store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let id = store.add_class(file_id, "C", SymbolTable::root_scope_id(), Vec::new());
let TestCase {
store,
file_id,
root_scope,
..
} = create_test();
let id = store.add_class_type(file_id, "C", root_scope, Vec::new());
assert_eq!(store.get_class(id).name(), "C");
let inst = Type::Instance(id);
assert_eq!(format!("{}", inst.display(&store)), "C");
@@ -677,21 +973,26 @@ mod tests {
#[test]
fn add_function() {
let store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let mut table = SymbolTable::new();
let func_symbol = table.add_or_update_symbol(
let TestCase {
store,
file_id,
root_scope,
..
} = create_test();
let mut builder = SymbolTableBuilder::new();
let func_symbol = builder.add_or_update_symbol(
SymbolTable::root_scope_id(),
"func",
SymbolFlags::IS_DEFINED,
);
builder.finish();
let id = store.add_function(
let id = store.add_function_type(
file_id,
"func",
func_symbol,
SymbolTable::root_scope_id(),
root_scope,
vec![Type::Unknown],
);
assert_eq!(store.get_function(id).name(), "func");
@@ -702,44 +1003,117 @@ mod tests {
#[test]
fn add_union() {
let mut store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let c1 = store.add_class(file_id, "C1", SymbolTable::root_scope_id(), Vec::new());
let c2 = store.add_class(file_id, "C2", SymbolTable::root_scope_id(), Vec::new());
let TestCase {
store,
file_id,
root_scope,
..
} = create_test();
let c1 = store.add_class_type(file_id, "C1", root_scope, Vec::new());
let c2 = store.add_class_type(file_id, "C2", root_scope, Vec::new());
let elems = vec![Type::Instance(c1), Type::Instance(c2)];
let id = store.add_union(file_id, &elems);
assert_eq!(
store.get_union(id).elements,
elems.into_iter().collect::<FxIndexSet<_>>()
);
let id = store.add_union_type(file_id, &elems);
let union = Type::Union(id);
assert_eq!(format!("{}", union.display(&store)), "(C1 | C2)");
assert_union_elements(&store, union, &elems);
assert_eq!(format!("{}", union.display(&store)), "C1 | C2");
}
#[test]
fn add_intersection() {
let mut store = TypeStore::default();
let files = Files::default();
let file_id = files.intern(Path::new("/foo"));
let c1 = store.add_class(file_id, "C1", SymbolTable::root_scope_id(), Vec::new());
let c2 = store.add_class(file_id, "C2", SymbolTable::root_scope_id(), Vec::new());
let c3 = store.add_class(file_id, "C3", SymbolTable::root_scope_id(), Vec::new());
let TestCase {
store,
file_id,
root_scope,
..
} = create_test();
let c1 = store.add_class_type(file_id, "C1", root_scope, Vec::new());
let c2 = store.add_class_type(file_id, "C2", root_scope, Vec::new());
let c3 = store.add_class_type(file_id, "C3", root_scope, Vec::new());
let pos = vec![Type::Instance(c1), Type::Instance(c2)];
let neg = vec![Type::Instance(c3)];
let id = store.add_intersection(file_id, &pos, &neg);
assert_eq!(
store.get_intersection(id).positive,
pos.into_iter().collect::<FxIndexSet<_>>()
);
assert_eq!(
store.get_intersection(id).negative,
neg.into_iter().collect::<FxIndexSet<_>>()
);
let id = store.add_intersection_type(file_id, &pos, &neg);
let intersection = Type::Intersection(id);
assert_eq!(
format!("{}", intersection.display(&store)),
"(C1 & C2 & ~C3)"
);
assert_intersection_elements(&store, intersection, &pos, &neg);
assert_eq!(format!("{}", intersection.display(&store)), "C1 & C2 & ~C3");
}
#[test]
fn flatten_union_zero_elements() {
let TestCase { store, file_id, .. } = create_test();
let ty = store.add_union(file_id, &[]);
assert!(matches!(ty, Type::Never), "{ty:?} should be Never");
}
#[test]
fn flatten_union_one_element() {
let TestCase { store, file_id, .. } = create_test();
let ty = store.add_union(file_id, &[Type::None]);
assert!(matches!(ty, Type::None), "{ty:?} should be None");
}
#[test]
fn flatten_nested_union() {
let TestCase { store, file_id, .. } = create_test();
let l1 = Type::IntLiteral(1);
let l2 = Type::IntLiteral(2);
let u1 = store.add_union(file_id, &[l1, l2]);
let u2 = store.add_union(file_id, &[u1, Type::None]);
assert_union_elements(&store, u2, &[l1, l2, Type::None]);
}
#[test]
fn flatten_intersection_zero_elements() {
let TestCase { store, file_id, .. } = create_test();
let ty = store.add_intersection(file_id, &[], &[]);
// TODO should be object, not Any
assert!(matches!(ty, Type::Any), "{ty:?} should be object");
}
#[test]
fn flatten_intersection_one_positive_element() {
let TestCase { store, file_id, .. } = create_test();
let ty = store.add_intersection(file_id, &[Type::None], &[]);
assert!(matches!(ty, Type::None), "{ty:?} should be None");
}
#[test]
fn flatten_intersection_one_negative_element() {
let TestCase { store, file_id, .. } = create_test();
let ty = store.add_intersection(file_id, &[], &[Type::None]);
assert_intersection_elements(&store, ty, &[], &[Type::None]);
}
#[test]
fn flatten_nested_intersection() {
let TestCase {
store,
file_id,
root_scope,
..
} = create_test();
let c1 = Type::Instance(store.add_class_type(file_id, "C1", root_scope, vec![]));
let c2 = Type::Instance(store.add_class_type(file_id, "C2", root_scope, vec![]));
let c1sub = Type::Instance(store.add_class_type(file_id, "C1sub", root_scope, vec![c1]));
let i1 = store.add_intersection(file_id, &[c1, c2], &[c1sub]);
let i2 = store.add_intersection(file_id, &[i1, Type::None], &[]);
assert_intersection_elements(&store, i2, &[c1, c2, Type::None], &[c1sub]);
}
}


@@ -0,0 +1,762 @@
#![allow(dead_code)]
use ruff_python_ast as ast;
use ruff_python_ast::AstNode;
use std::fmt::Debug;
use crate::db::{QueryResult, SemanticDb, SemanticJar};
use crate::module::{resolve_module, ModuleName};
use crate::parse::parse;
use crate::semantic::types::{ModuleTypeId, Type};
use crate::semantic::{
resolve_global_symbol, semantic_index, ConstrainedDefinition, Definition, GlobalSymbolId,
ImportDefinition, ImportFromDefinition,
};
use crate::{FileId, Name};
// FIXME: Figure out proper deadlock-free synchronisation now that this takes `&db` instead of `&mut db`.
/// Resolve the public-facing type for a symbol (the type seen by other scopes: other modules, or
/// nested functions). Because calls to nested functions and imports can occur anywhere in control
/// flow, this type must be conservative and consider all definitions of the symbol that could
/// possibly be seen by another scope. Currently we take the most conservative approach, which is
/// the union of all definitions. We may be able to narrow this in future to eliminate definitions
/// which can't possibly (or at least likely) be seen by any other scope, so that e.g. we could
/// infer `Literal["1"]` instead of `Literal[1] | Literal["1"]` for `x` in `x = 1; x = str(x)`.
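// A minimal sketch of the conservative behaviour: given
//
//     x = 1
//     x = 2
//
// another module importing `x` could observe either assignment (imports may
// occur at any point in control flow), so the public type of `x` is the
// union `Literal[1] | Literal[2]`.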
#[tracing::instrument(level = "trace", skip(db))]
pub fn infer_symbol_public_type(db: &dyn SemanticDb, symbol: GlobalSymbolId) -> QueryResult<Type> {
let index = semantic_index(db, symbol.file_id)?;
let defs = index.symbol_table().definitions(symbol.symbol_id).to_vec();
let jar: &SemanticJar = db.jar()?;
if let Some(ty) = jar.type_store.get_cached_symbol_public_type(symbol) {
return Ok(ty);
}
let ty = infer_type_from_definitions(db, symbol, defs.iter().cloned())?;
jar.type_store.cache_symbol_public_type(symbol, ty);
// TODO record dependencies
Ok(ty)
}
/// Infer type of a symbol as union of the given `Definitions`.
fn infer_type_from_definitions<T>(
db: &dyn SemanticDb,
symbol: GlobalSymbolId,
definitions: T,
) -> QueryResult<Type>
where
T: Debug + IntoIterator<Item = Definition>,
{
infer_type_from_constrained_definitions(
db,
symbol,
definitions
.into_iter()
.map(|definition| ConstrainedDefinition {
definition,
constraints: vec![],
}),
)
}
/// Infer type of a symbol as union of the given `ConstrainedDefinitions`.
fn infer_type_from_constrained_definitions<T>(
db: &dyn SemanticDb,
symbol: GlobalSymbolId,
constrained_definitions: T,
) -> QueryResult<Type>
where
T: IntoIterator<Item = ConstrainedDefinition>,
{
let jar: &SemanticJar = db.jar()?;
let mut tys = constrained_definitions
.into_iter()
.map(|def| infer_constrained_definition_type(db, symbol, def.clone()))
.peekable();
if let Some(first) = tys.next() {
if tys.peek().is_some() {
Ok(jar.type_store.add_union(
symbol.file_id,
&Iterator::chain(std::iter::once(first), tys).collect::<QueryResult<Vec<_>>>()?,
))
} else {
first
}
} else {
Ok(Type::Unknown)
}
}
/// Infer type for a ConstrainedDefinition (intersection of the definition type and the
/// constraints)
#[tracing::instrument(level = "trace", skip(db))]
pub fn infer_constrained_definition_type(
db: &dyn SemanticDb,
symbol: GlobalSymbolId,
constrained_definition: ConstrainedDefinition,
) -> QueryResult<Type> {
let ConstrainedDefinition {
definition,
constraints,
} = constrained_definition;
let index = semantic_index(db, symbol.file_id)?;
let parsed = parse(db.upcast(), symbol.file_id)?;
let mut intersected_types = vec![infer_definition_type(db, symbol, definition)?];
for constraint in constraints {
if let Some(constraint_type) = infer_constraint_type(
db,
symbol,
index.resolve_expression_id(parsed.syntax(), constraint),
)? {
intersected_types.push(constraint_type);
}
}
let jar: &SemanticJar = db.jar()?;
Ok(jar
.type_store
.add_intersection(symbol.file_id, &intersected_types, &[]))
}
/// Infer a type for a Definition
#[tracing::instrument(level = "trace", skip(db))]
pub fn infer_definition_type(
db: &dyn SemanticDb,
symbol: GlobalSymbolId,
definition: Definition,
) -> QueryResult<Type> {
let jar: &SemanticJar = db.jar()?;
let type_store = &jar.type_store;
let file_id = symbol.file_id;
match definition {
Definition::Unbound => Ok(Type::Unbound),
Definition::Import(ImportDefinition {
module: module_name,
}) => {
if let Some(module) = resolve_module(db, module_name.clone())? {
Ok(Type::Module(ModuleTypeId { module, file_id }))
} else {
Ok(Type::Unknown)
}
}
Definition::ImportFrom(ImportFromDefinition {
module,
name,
level,
}) => {
// TODO relative imports
assert!(matches!(level, 0));
let module_name = ModuleName::new(module.as_ref().expect("TODO relative imports"));
let Some(module) = resolve_module(db, module_name.clone())? else {
return Ok(Type::Unknown);
};
if let Some(remote_symbol) = resolve_global_symbol(db, module, &name)? {
infer_symbol_public_type(db, remote_symbol)
} else {
Ok(Type::Unknown)
}
}
Definition::ClassDef(node_key) => {
if let Some(ty) = type_store.get_cached_node_type(file_id, node_key.erased()) {
Ok(ty)
} else {
let parsed = parse(db.upcast(), file_id)?;
let ast = parsed.syntax();
let index = semantic_index(db, file_id)?;
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
let mut bases = Vec::with_capacity(node.bases().len());
for base in node.bases() {
bases.push(infer_expr_type(db, file_id, base)?);
}
let scope_id = index.symbol_table().scope_id_for_node(node_key.erased());
let ty = type_store.add_class(file_id, &node.name.id, scope_id, bases);
type_store.cache_node_type(file_id, *node_key.erased(), ty);
Ok(ty)
}
}
Definition::FunctionDef(node_key) => {
if let Some(ty) = type_store.get_cached_node_type(file_id, node_key.erased()) {
Ok(ty)
} else {
let parsed = parse(db.upcast(), file_id)?;
let ast = parsed.syntax();
let index = semantic_index(db, file_id)?;
let node = node_key
.resolve(ast.as_any_node_ref())
.expect("node key should resolve");
let decorator_tys = node
.decorator_list
.iter()
.map(|decorator| infer_expr_type(db, file_id, &decorator.expression))
.collect::<QueryResult<_>>()?;
let scope_id = index.symbol_table().scope_id_for_node(node_key.erased());
let ty = type_store.add_function(
file_id,
&node.name.id,
symbol.symbol_id,
scope_id,
decorator_tys,
);
type_store.cache_node_type(file_id, *node_key.erased(), ty);
Ok(ty)
}
}
Definition::Assignment(node_key) => {
let parsed = parse(db.upcast(), file_id)?;
let ast = parsed.syntax();
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
// TODO handle unpacking assignment
infer_expr_type(db, file_id, &node.value)
}
Definition::AnnotatedAssignment(node_key) => {
let parsed = parse(db.upcast(), file_id)?;
let ast = parsed.syntax();
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
// TODO actually look at the annotation
let Some(value) = &node.value else {
return Ok(Type::Unknown);
};
// TODO handle unpacking assignment
infer_expr_type(db, file_id, value)
}
Definition::NamedExpr(node_key) => {
let parsed = parse(db.upcast(), file_id)?;
let ast = parsed.syntax();
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
infer_expr_type(db, file_id, &node.value)
}
}
}
/// Return the type that the given constraint (an expression from a control-flow test) requires the
/// given symbol to have. For example, returns the Type "~None" as the constraint type if given the
/// symbol ID for x and the expression ID for `x is not None`. Returns (Rust) None if the given
/// expression applies no constraint to the given symbol.
#[tracing::instrument(level = "trace", skip(db))]
fn infer_constraint_type(
db: &dyn SemanticDb,
symbol_id: GlobalSymbolId,
// TODO this should preferably take an &ast::Expr instead of AnyNodeRef
expression: ast::AnyNodeRef,
) -> QueryResult<Option<Type>> {
let file_id = symbol_id.file_id;
let index = semantic_index(db, file_id)?;
let jar: &SemanticJar = db.jar()?;
let symbol_name = symbol_id.symbol_id.symbol(&index.symbol_table).name();
// TODO narrowing attributes
// TODO narrowing dict keys
// TODO isinstance, ==/!=, type(...), literals, bools...
match expression {
ast::AnyNodeRef::ExprCompare(ast::ExprCompare {
left,
ops,
comparators,
..
}) => {
// TODO chained comparisons
match left.as_ref() {
ast::Expr::Name(ast::ExprName { id, .. }) if id == symbol_name => match ops[0] {
ast::CmpOp::Is | ast::CmpOp::IsNot => {
Ok(match infer_expr_type(db, file_id, &comparators[0])? {
Type::None => Some(Type::None),
_ => None,
}
.map(|ty| {
if matches!(ops[0], ast::CmpOp::IsNot) {
jar.type_store.add_intersection(file_id, &[], &[ty])
} else {
ty
}
}))
}
_ => Ok(None),
},
_ => Ok(None),
}
}
_ => Ok(None),
}
}
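// For example (per the `narrow_none` test below): for the test expression
// `x is not None`, the constraint type for `x` is the intersection `~None`
// (no positive elements, negative `[None]`); for `x is None` it is `None`
// itself. Other comparison operators currently yield `Ok(None)`.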
/// Infer type of the given expression.
fn infer_expr_type(db: &dyn SemanticDb, file_id: FileId, expr: &ast::Expr) -> QueryResult<Type> {
// TODO cache the resolution of the type on the node
let index = semantic_index(db, file_id)?;
match expr {
ast::Expr::NoneLiteral(_) => Ok(Type::None),
ast::Expr::NumberLiteral(ast::ExprNumberLiteral { value, .. }) => {
match value {
ast::Number::Int(n) => {
// TODO support big int literals
Ok(n.as_i64().map(Type::IntLiteral).unwrap_or(Type::Unknown))
}
// TODO builtins.float or builtins.complex
_ => Ok(Type::Unknown),
}
}
ast::Expr::Name(name) => {
// TODO look up in the correct scope, don't assume global
if let Some(symbol_id) = index.symbol_table().root_symbol_id_by_name(&name.id) {
infer_type_from_constrained_definitions(
db,
GlobalSymbolId { file_id, symbol_id },
index.reachable_definitions(symbol_id, expr),
)
} else {
Ok(Type::Unknown)
}
}
ast::Expr::Attribute(ast::ExprAttribute { value, attr, .. }) => {
let value_type = infer_expr_type(db, file_id, value)?;
let attr_name = &Name::new(&attr.id);
value_type
.get_member(db, attr_name)
.map(|ty| ty.unwrap_or(Type::Unknown))
}
ast::Expr::BinOp(ast::ExprBinOp {
left, op, right, ..
}) => {
let left_ty = infer_expr_type(db, file_id, left)?;
let right_ty = infer_expr_type(db, file_id, right)?;
// TODO add reverse bin op support if right <: left
left_ty.resolve_bin_op(db, *op, right_ty)
}
ast::Expr::Named(ast::ExprNamed { value, .. }) => infer_expr_type(db, file_id, value),
ast::Expr::If(ast::ExprIf { body, orelse, .. }) => {
// TODO detect statically known truthy or falsy test
let body_ty = infer_expr_type(db, file_id, body)?;
let else_ty = infer_expr_type(db, file_id, orelse)?;
let jar: &SemanticJar = db.jar()?;
Ok(jar.type_store.add_union(file_id, &[body_ty, else_ty]))
}
_ => todo!("expression type resolution for {:?}", expr),
}
}
#[cfg(test)]
mod tests {
use std::path::PathBuf;
use crate::db::tests::TestDb;
use crate::db::{HasJar, SemanticJar};
use crate::module::{
resolve_module, set_module_search_paths, ModuleName, ModuleResolutionInputs,
};
use crate::semantic::{infer_symbol_public_type, resolve_global_symbol, Type};
use crate::Name;
// TODO with virtual filesystem we shouldn't have to write files to disk for these
// tests
struct TestCase {
temp_dir: tempfile::TempDir,
db: TestDb,
src: PathBuf,
}
fn create_test() -> std::io::Result<TestCase> {
let temp_dir = tempfile::tempdir()?;
let src = temp_dir.path().join("src");
std::fs::create_dir(&src)?;
let src = src.canonicalize()?;
let search_paths = ModuleResolutionInputs {
extra_paths: vec![],
workspace_root: src.clone(),
site_packages: None,
custom_typeshed: None,
};
let mut db = TestDb::default();
set_module_search_paths(&mut db, search_paths);
Ok(TestCase { temp_dir, db, src })
}
fn write_to_path(case: &TestCase, relative_path: &str, contents: &str) -> anyhow::Result<()> {
let path = case.src.join(relative_path);
std::fs::write(path, contents)?;
Ok(())
}
fn get_public_type(
case: &TestCase,
module_name: &str,
variable_name: &str,
) -> anyhow::Result<Type> {
let db = &case.db;
let module = resolve_module(db, ModuleName::new(module_name))?.expect("Module to exist");
let symbol = resolve_global_symbol(db, module, variable_name)?.expect("symbol to exist");
Ok(infer_symbol_public_type(db, symbol)?)
}
fn assert_public_type(
case: &TestCase,
module_name: &str,
variable_name: &str,
type_name: &str,
) -> anyhow::Result<()> {
let ty = get_public_type(case, module_name, variable_name)?;
let jar = HasJar::<SemanticJar>::jar(&case.db)?;
assert_eq!(format!("{}", ty.display(&jar.type_store)), type_name);
Ok(())
}
#[test]
fn follow_import_to_class() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(&case, "a.py", "from b import C as D; E = D")?;
write_to_path(&case, "b.py", "class C: pass")?;
assert_public_type(&case, "a", "E", "Literal[C]")
}
#[test]
fn resolve_base_class_by_name() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"mod.py",
"
class Base: pass
class Sub(Base): pass
",
)?;
let ty = get_public_type(&case, "mod", "Sub")?;
let Type::Class(class_id) = ty else {
panic!("Sub is not a Class")
};
let jar = HasJar::<SemanticJar>::jar(&case.db)?;
let base_names: Vec<_> = jar
.type_store
.get_class(class_id)
.bases()
.iter()
.map(|base_ty| format!("{}", base_ty.display(&jar.type_store)))
.collect();
assert_eq!(base_names, vec!["Literal[Base]"]);
Ok(())
}
#[test]
fn resolve_method() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"mod.py",
"
class C:
def f(self): pass
",
)?;
let ty = get_public_type(&case, "mod", "C")?;
let Type::Class(class_id) = ty else {
panic!("C is not a Class");
};
let member_ty = class_id
.get_own_class_member(&case.db, &Name::new("f"))
.expect("C.f to resolve");
let Some(Type::Function(func_id)) = member_ty else {
panic!("C.f is not a Function");
};
let jar = HasJar::<SemanticJar>::jar(&case.db)?;
let function = jar.type_store.get_function(func_id);
assert_eq!(function.name(), "f");
Ok(())
}
#[test]
fn resolve_module_member() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(&case, "a.py", "import b; D = b.C")?;
write_to_path(&case, "b.py", "class C: pass")?;
assert_public_type(&case, "a", "D", "Literal[C]")
}
#[test]
fn resolve_literal() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(&case, "a.py", "x = 1")?;
assert_public_type(&case, "a", "x", "Literal[1]")
}
#[test]
fn resolve_union() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
if flag:
x = 1
else:
x = 2
",
)?;
assert_public_type(&case, "a", "x", "Literal[1, 2]")
}
#[test]
fn resolve_visible_def() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(&case, "a.py", "y = 1; y = 2; x = y")?;
assert_public_type(&case, "a", "x", "Literal[2]")
}
#[test]
fn join_paths() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
y = 1
y = 2
if flag:
y = 3
x = y
",
)?;
assert_public_type(&case, "a", "x", "Literal[2, 3]")
}
#[test]
fn maybe_unbound() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
if flag:
y = 1
x = y
",
)?;
assert_public_type(&case, "a", "x", "Literal[1] | Unbound")
}
#[test]
fn if_elif_else() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
y = 1
y = 2
if flag:
y = 3
elif flag2:
y = 4
else:
r = y
y = 5
s = y
x = y
",
)?;
assert_public_type(&case, "a", "x", "Literal[3, 4, 5]")?;
assert_public_type(&case, "a", "r", "Literal[2]")?;
assert_public_type(&case, "a", "s", "Literal[5]")
}
#[test]
fn if_elif() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
y = 1
y = 2
if flag:
y = 3
elif flag2:
y = 4
x = y
",
)?;
assert_public_type(&case, "a", "x", "Literal[2, 3, 4]")
}
#[test]
fn literal_int_arithmetic() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
a = 2 + 1
b = a - 4
c = a * b
d = c / 3
e = 5 % 3
",
)?;
assert_public_type(&case, "a", "a", "Literal[3]")?;
assert_public_type(&case, "a", "b", "Literal[-1]")?;
assert_public_type(&case, "a", "c", "Literal[-3]")?;
assert_public_type(&case, "a", "d", "Literal[-1]")?;
assert_public_type(&case, "a", "e", "Literal[2]")
}
#[test]
fn walrus() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
x = (y := 1) + 1
",
)?;
assert_public_type(&case, "a", "x", "Literal[2]")?;
assert_public_type(&case, "a", "y", "Literal[1]")
}
#[test]
fn ifexpr() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
x = 1 if flag else 2
",
)?;
assert_public_type(&case, "a", "x", "Literal[1, 2]")
}
#[test]
fn ifexpr_walrus() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
y = z = 0
x = (y := 1) if flag else (z := 2)
a = y
b = z
",
)?;
assert_public_type(&case, "a", "x", "Literal[1, 2]")?;
assert_public_type(&case, "a", "a", "Literal[0, 1]")?;
assert_public_type(&case, "a", "b", "Literal[0, 2]")
}
#[test]
fn ifexpr_walrus_2() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
y = 0
(y := 1) if flag else (y := 2)
a = y
",
)?;
assert_public_type(&case, "a", "a", "Literal[1, 2]")
}
#[test]
fn ifexpr_nested() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
x = 1 if flag else 2 if flag2 else 3
",
)?;
assert_public_type(&case, "a", "x", "Literal[1, 2, 3]")
}
#[test]
fn none() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
x = 1 if flag else None
",
)?;
assert_public_type(&case, "a", "x", "Literal[1] | None")
}
#[test]
fn narrow_none() -> anyhow::Result<()> {
let case = create_test()?;
write_to_path(
&case,
"a.py",
"
x = 1 if flag else None
y = 0
if x is not None:
y = x
z = y
",
)?;
// TODO normalization of unions and intersections: this type is technically correct but
// begging for normalization
assert_public_type(&case, "a", "z", "Literal[0] | Literal[1] | None & ~None")
}
}


@@ -53,6 +53,16 @@ pub enum SourceKind {
IpyNotebook(Arc<Notebook>),
}
impl<'a> From<&'a SourceKind> for PySourceType {
fn from(value: &'a SourceKind) -> Self {
match value {
SourceKind::Python(_) => PySourceType::Python,
SourceKind::Stub(_) => PySourceType::Stub,
SourceKind::IpyNotebook(_) => PySourceType::Ipynb,
}
}
}
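// e.g. `PySourceType::from(&SourceKind::IpyNotebook(notebook))` yields
// `PySourceType::Ipynb`.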
#[derive(Debug, Clone, PartialEq)]
pub struct Source {
kind: SourceKind,

File diff suppressed because it is too large


@@ -1,292 +0,0 @@
#![allow(dead_code)]
use ruff_python_ast as ast;
use ruff_python_ast::AstNode;
use crate::db::{QueryResult, SemanticDb, SemanticJar};
use crate::module::ModuleName;
use crate::parse::parse;
use crate::symbols::{
resolve_global_symbol, symbol_table, Definition, GlobalSymbolId, ImportFromDefinition,
};
use crate::types::Type;
use crate::FileId;
// FIXME: Figure out proper deadlock-free synchronisation now that this takes `&db` instead of `&mut db`.
#[tracing::instrument(level = "trace", skip(db))]
pub fn infer_symbol_type(db: &dyn SemanticDb, symbol: GlobalSymbolId) -> QueryResult<Type> {
let symbols = symbol_table(db, symbol.file_id)?;
let defs = symbols.definitions(symbol.symbol_id);
let jar: &SemanticJar = db.jar()?;
if let Some(ty) = jar.type_store.get_cached_symbol_type(symbol) {
return Ok(ty);
}
// TODO handle multiple defs, conditional defs...
assert_eq!(defs.len(), 1);
let ty = infer_definition_type(db, symbol, defs[0].clone())?;
jar.type_store.cache_symbol_type(symbol, ty);
// TODO record dependencies
Ok(ty)
}
#[tracing::instrument(level = "trace", skip(db))]
pub fn infer_definition_type(
db: &dyn SemanticDb,
symbol: GlobalSymbolId,
definition: Definition,
) -> QueryResult<Type> {
let jar: &SemanticJar = db.jar()?;
let type_store = &jar.type_store;
let file_id = symbol.file_id;
match definition {
Definition::ImportFrom(ImportFromDefinition {
module,
name,
level,
}) => {
// TODO relative imports
assert!(matches!(level, 0));
let module_name = ModuleName::new(module.as_ref().expect("TODO relative imports"));
if let Some(remote_symbol) = resolve_global_symbol(db, module_name, &name)? {
infer_symbol_type(db, remote_symbol)
} else {
Ok(Type::Unknown)
}
}
Definition::ClassDef(node_key) => {
if let Some(ty) = type_store.get_cached_node_type(file_id, node_key.erased()) {
Ok(ty)
} else {
let parsed = parse(db.upcast(), file_id)?;
let ast = parsed.ast();
let table = symbol_table(db, file_id)?;
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
let mut bases = Vec::with_capacity(node.bases().len());
for base in node.bases() {
bases.push(infer_expr_type(db, file_id, base)?);
}
let scope_id = table.scope_id_for_node(node_key.erased());
let ty = Type::Class(type_store.add_class(file_id, &node.name.id, scope_id, bases));
type_store.cache_node_type(file_id, *node_key.erased(), ty);
Ok(ty)
}
}
Definition::FunctionDef(node_key) => {
if let Some(ty) = type_store.get_cached_node_type(file_id, node_key.erased()) {
Ok(ty)
} else {
let parsed = parse(db.upcast(), file_id)?;
let ast = parsed.ast();
let table = symbol_table(db, file_id)?;
let node = node_key
.resolve(ast.as_any_node_ref())
.expect("node key should resolve");
let decorator_tys = node
.decorator_list
.iter()
.map(|decorator| infer_expr_type(db, file_id, &decorator.expression))
.collect::<QueryResult<_>>()?;
let scope_id = table.scope_id_for_node(node_key.erased());
let ty = type_store
.add_function(
file_id,
&node.name.id,
symbol.symbol_id,
scope_id,
decorator_tys,
)
.into();
type_store.cache_node_type(file_id, *node_key.erased(), ty);
Ok(ty)
}
}
Definition::Assignment(node_key) => {
let parsed = parse(db.upcast(), file_id)?;
let ast = parsed.ast();
let node = node_key.resolve_unwrap(ast.as_any_node_ref());
// TODO handle unpacking assignment correctly
infer_expr_type(db, file_id, &node.value)
}
_ => todo!("other kinds of definitions"),
}
}
fn infer_expr_type(db: &dyn SemanticDb, file_id: FileId, expr: &ast::Expr) -> QueryResult<Type> {
// TODO cache the resolution of the type on the node
let symbols = symbol_table(db, file_id)?;
match expr {
ast::Expr::Name(name) => {
// TODO look up in the correct scope, don't assume global
if let Some(symbol_id) = symbols.root_symbol_id_by_name(&name.id) {
infer_symbol_type(db, GlobalSymbolId { file_id, symbol_id })
} else {
Ok(Type::Unknown)
}
}
_ => todo!("full expression type resolution"),
}
}
#[cfg(test)]
mod tests {
use crate::db::tests::TestDb;
use crate::db::{HasJar, SemanticJar};
use crate::module::{
resolve_module, set_module_search_paths, ModuleName, ModuleSearchPath, ModuleSearchPathKind,
};
use crate::symbols::{symbol_table, GlobalSymbolId};
use crate::types::{infer_symbol_type, Type};
use crate::Name;
// TODO with virtual filesystem we shouldn't have to write files to disk for these
// tests
struct TestCase {
temp_dir: tempfile::TempDir,
db: TestDb,
src: ModuleSearchPath,
}
fn create_test() -> std::io::Result<TestCase> {
let temp_dir = tempfile::tempdir()?;
let src = temp_dir.path().join("src");
std::fs::create_dir(&src)?;
let src = ModuleSearchPath::new(src.canonicalize()?, ModuleSearchPathKind::FirstParty);
let roots = vec![src.clone()];
let mut db = TestDb::default();
set_module_search_paths(&mut db, roots);
Ok(TestCase { temp_dir, db, src })
}
#[test]
fn follow_import_to_class() -> anyhow::Result<()> {
let case = create_test()?;
let db = &case.db;
let a_path = case.src.path().join("a.py");
let b_path = case.src.path().join("b.py");
std::fs::write(a_path, "from b import C as D; E = D")?;
std::fs::write(b_path, "class C: pass")?;
let a_file = resolve_module(db, ModuleName::new("a"))?
.expect("module should be found")
.path(db)?
.file();
let a_syms = symbol_table(db, a_file)?;
let e_sym = a_syms
.root_symbol_id_by_name("E")
.expect("E symbol should be found");
let ty = infer_symbol_type(
db,
GlobalSymbolId {
file_id: a_file,
symbol_id: e_sym,
},
)?;
let jar = HasJar::<SemanticJar>::jar(db)?;
assert!(matches!(ty, Type::Class(_)));
assert_eq!(format!("{}", ty.display(&jar.type_store)), "Literal[C]");
Ok(())
}
#[test]
fn resolve_base_class_by_name() -> anyhow::Result<()> {
let case = create_test()?;
let db = &case.db;
let path = case.src.path().join("mod.py");
std::fs::write(path, "class Base: pass\nclass Sub(Base): pass")?;
let file = resolve_module(db, ModuleName::new("mod"))?
.expect("module should be found")
.path(db)?
.file();
let syms = symbol_table(db, file)?;
let sym = syms
.root_symbol_id_by_name("Sub")
.expect("Sub symbol should be found");
let ty = infer_symbol_type(
db,
GlobalSymbolId {
file_id: file,
symbol_id: sym,
},
)?;
let Type::Class(class_id) = ty else {
panic!("Sub is not a Class")
};
let jar = HasJar::<SemanticJar>::jar(db)?;
let base_names: Vec<_> = jar
.type_store
.get_class(class_id)
.bases()
.iter()
.map(|base_ty| format!("{}", base_ty.display(&jar.type_store)))
.collect();
assert_eq!(base_names, vec!["Literal[Base]"]);
Ok(())
}
#[test]
fn resolve_method() -> anyhow::Result<()> {
let case = create_test()?;
let db = &case.db;
let path = case.src.path().join("mod.py");
std::fs::write(path, "class C:\n def f(self): pass")?;
let file = resolve_module(db, ModuleName::new("mod"))?
.expect("module should be found")
.path(db)?
.file();
let syms = symbol_table(db, file)?;
let sym = syms
.root_symbol_id_by_name("C")
.expect("C symbol should be found");
let ty = infer_symbol_type(
db,
GlobalSymbolId {
file_id: file,
symbol_id: sym,
},
)?;
let Type::Class(class_id) = ty else {
panic!("C is not a Class");
};
let member_ty = class_id
.get_own_class_member(db, &Name::new("f"))
.expect("C.f to resolve");
let Some(Type::Function(func_id)) = member_ty else {
panic!("C.f is not a Function");
};
let jar = HasJar::<SemanticJar>::jar(db)?;
let function = jar.type_store.get_function(func_id);
assert_eq!(function.name(), "f");
Ok(())
}
}

View File

@@ -1 +0,0 @@
a9d7e861f7a46ae7acd56569326adef302e10f29

View File

@@ -1,591 +0,0 @@
import sys
import typing_extensions
from typing import Any, ClassVar, Literal
PyCF_ONLY_AST: Literal[1024]
PyCF_TYPE_COMMENTS: Literal[4096]
PyCF_ALLOW_TOP_LEVEL_AWAIT: Literal[8192]
# Alias used for fields that must always be valid identifiers
# A string `x` counts as a valid identifier if both the following are True
# (1) `x.isidentifier()` evaluates to `True`
# (2) `keyword.iskeyword(x)` evaluates to `False`
_Identifier: typing_extensions.TypeAlias = str
class AST:
if sys.version_info >= (3, 10):
__match_args__ = ()
_attributes: ClassVar[tuple[str, ...]]
_fields: ClassVar[tuple[str, ...]]
def __init__(self, *args: Any, **kwargs: Any) -> None: ...
# TODO: Not all nodes have all of the following attributes
lineno: int
col_offset: int
end_lineno: int | None
end_col_offset: int | None
type_comment: str | None
class mod(AST): ...
class type_ignore(AST): ...
class TypeIgnore(type_ignore):
if sys.version_info >= (3, 10):
__match_args__ = ("lineno", "tag")
tag: str
class FunctionType(mod):
if sys.version_info >= (3, 10):
__match_args__ = ("argtypes", "returns")
argtypes: list[expr]
returns: expr
class Module(mod):
if sys.version_info >= (3, 10):
__match_args__ = ("body", "type_ignores")
body: list[stmt]
type_ignores: list[TypeIgnore]
class Interactive(mod):
if sys.version_info >= (3, 10):
__match_args__ = ("body",)
body: list[stmt]
class Expression(mod):
if sys.version_info >= (3, 10):
__match_args__ = ("body",)
body: expr
class stmt(AST): ...
class FunctionDef(stmt):
if sys.version_info >= (3, 12):
__match_args__ = ("name", "args", "body", "decorator_list", "returns", "type_comment", "type_params")
elif sys.version_info >= (3, 10):
__match_args__ = ("name", "args", "body", "decorator_list", "returns", "type_comment")
name: _Identifier
args: arguments
body: list[stmt]
decorator_list: list[expr]
returns: expr | None
if sys.version_info >= (3, 12):
type_params: list[type_param]
class AsyncFunctionDef(stmt):
if sys.version_info >= (3, 12):
__match_args__ = ("name", "args", "body", "decorator_list", "returns", "type_comment", "type_params")
elif sys.version_info >= (3, 10):
__match_args__ = ("name", "args", "body", "decorator_list", "returns", "type_comment")
name: _Identifier
args: arguments
body: list[stmt]
decorator_list: list[expr]
returns: expr | None
if sys.version_info >= (3, 12):
type_params: list[type_param]
class ClassDef(stmt):
if sys.version_info >= (3, 12):
__match_args__ = ("name", "bases", "keywords", "body", "decorator_list", "type_params")
elif sys.version_info >= (3, 10):
__match_args__ = ("name", "bases", "keywords", "body", "decorator_list")
name: _Identifier
bases: list[expr]
keywords: list[keyword]
body: list[stmt]
decorator_list: list[expr]
if sys.version_info >= (3, 12):
type_params: list[type_param]
class Return(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("value",)
value: expr | None
class Delete(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("targets",)
targets: list[expr]
class Assign(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("targets", "value", "type_comment")
targets: list[expr]
value: expr
class AugAssign(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("target", "op", "value")
target: Name | Attribute | Subscript
op: operator
value: expr
class AnnAssign(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("target", "annotation", "value", "simple")
target: Name | Attribute | Subscript
annotation: expr
value: expr | None
simple: int
class For(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("target", "iter", "body", "orelse", "type_comment")
target: expr
iter: expr
body: list[stmt]
orelse: list[stmt]
class AsyncFor(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("target", "iter", "body", "orelse", "type_comment")
target: expr
iter: expr
body: list[stmt]
orelse: list[stmt]
class While(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("test", "body", "orelse")
test: expr
body: list[stmt]
orelse: list[stmt]
class If(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("test", "body", "orelse")
test: expr
body: list[stmt]
orelse: list[stmt]
class With(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("items", "body", "type_comment")
items: list[withitem]
body: list[stmt]
class AsyncWith(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("items", "body", "type_comment")
items: list[withitem]
body: list[stmt]
class Raise(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("exc", "cause")
exc: expr | None
cause: expr | None
class Try(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("body", "handlers", "orelse", "finalbody")
body: list[stmt]
handlers: list[ExceptHandler]
orelse: list[stmt]
finalbody: list[stmt]
if sys.version_info >= (3, 11):
class TryStar(stmt):
__match_args__ = ("body", "handlers", "orelse", "finalbody")
body: list[stmt]
handlers: list[ExceptHandler]
orelse: list[stmt]
finalbody: list[stmt]
class Assert(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("test", "msg")
test: expr
msg: expr | None
class Import(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("names",)
names: list[alias]
class ImportFrom(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("module", "names", "level")
module: str | None
names: list[alias]
level: int
class Global(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("names",)
names: list[_Identifier]
class Nonlocal(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("names",)
names: list[_Identifier]
class Expr(stmt):
if sys.version_info >= (3, 10):
__match_args__ = ("value",)
value: expr
class Pass(stmt): ...
class Break(stmt): ...
class Continue(stmt): ...
class expr(AST): ...
class BoolOp(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("op", "values")
op: boolop
values: list[expr]
class BinOp(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("left", "op", "right")
left: expr
op: operator
right: expr
class UnaryOp(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("op", "operand")
op: unaryop
operand: expr
class Lambda(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("args", "body")
args: arguments
body: expr
class IfExp(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("test", "body", "orelse")
test: expr
body: expr
orelse: expr
class Dict(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("keys", "values")
keys: list[expr | None]
values: list[expr]
class Set(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("elts",)
elts: list[expr]
class ListComp(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("elt", "generators")
elt: expr
generators: list[comprehension]
class SetComp(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("elt", "generators")
elt: expr
generators: list[comprehension]
class DictComp(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("key", "value", "generators")
key: expr
value: expr
generators: list[comprehension]
class GeneratorExp(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("elt", "generators")
elt: expr
generators: list[comprehension]
class Await(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("value",)
value: expr
class Yield(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("value",)
value: expr | None
class YieldFrom(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("value",)
value: expr
class Compare(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("left", "ops", "comparators")
left: expr
ops: list[cmpop]
comparators: list[expr]
class Call(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("func", "args", "keywords")
func: expr
args: list[expr]
keywords: list[keyword]
class FormattedValue(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("value", "conversion", "format_spec")
value: expr
conversion: int
format_spec: expr | None
class JoinedStr(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("values",)
values: list[expr]
class Constant(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("value", "kind")
value: Any # None, str, bytes, bool, int, float, complex, Ellipsis
kind: str | None
# Aliases for value, for backwards compatibility
s: Any
n: int | float | complex
class NamedExpr(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("target", "value")
target: Name
value: expr
class Attribute(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("value", "attr", "ctx")
value: expr
attr: _Identifier
ctx: expr_context
if sys.version_info >= (3, 9):
_Slice: typing_extensions.TypeAlias = expr
else:
class slice(AST): ...
_Slice: typing_extensions.TypeAlias = slice
class Slice(_Slice):
if sys.version_info >= (3, 10):
__match_args__ = ("lower", "upper", "step")
lower: expr | None
upper: expr | None
step: expr | None
if sys.version_info < (3, 9):
class ExtSlice(slice):
dims: list[slice]
class Index(slice):
value: expr
class Subscript(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("value", "slice", "ctx")
value: expr
slice: _Slice
ctx: expr_context
class Starred(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("value", "ctx")
value: expr
ctx: expr_context
class Name(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("id", "ctx")
id: _Identifier
ctx: expr_context
class List(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("elts", "ctx")
elts: list[expr]
ctx: expr_context
class Tuple(expr):
if sys.version_info >= (3, 10):
__match_args__ = ("elts", "ctx")
elts: list[expr]
ctx: expr_context
if sys.version_info >= (3, 9):
dims: list[expr]
class expr_context(AST): ...
if sys.version_info < (3, 9):
class AugLoad(expr_context): ...
class AugStore(expr_context): ...
class Param(expr_context): ...
class Suite(mod):
body: list[stmt]
class Del(expr_context): ...
class Load(expr_context): ...
class Store(expr_context): ...
class boolop(AST): ...
class And(boolop): ...
class Or(boolop): ...
class operator(AST): ...
class Add(operator): ...
class BitAnd(operator): ...
class BitOr(operator): ...
class BitXor(operator): ...
class Div(operator): ...
class FloorDiv(operator): ...
class LShift(operator): ...
class Mod(operator): ...
class Mult(operator): ...
class MatMult(operator): ...
class Pow(operator): ...
class RShift(operator): ...
class Sub(operator): ...
class unaryop(AST): ...
class Invert(unaryop): ...
class Not(unaryop): ...
class UAdd(unaryop): ...
class USub(unaryop): ...
class cmpop(AST): ...
class Eq(cmpop): ...
class Gt(cmpop): ...
class GtE(cmpop): ...
class In(cmpop): ...
class Is(cmpop): ...
class IsNot(cmpop): ...
class Lt(cmpop): ...
class LtE(cmpop): ...
class NotEq(cmpop): ...
class NotIn(cmpop): ...
class comprehension(AST):
if sys.version_info >= (3, 10):
__match_args__ = ("target", "iter", "ifs", "is_async")
target: expr
iter: expr
ifs: list[expr]
is_async: int
class excepthandler(AST): ...
class ExceptHandler(excepthandler):
if sys.version_info >= (3, 10):
__match_args__ = ("type", "name", "body")
type: expr | None
name: _Identifier | None
body: list[stmt]
class arguments(AST):
if sys.version_info >= (3, 10):
__match_args__ = ("posonlyargs", "args", "vararg", "kwonlyargs", "kw_defaults", "kwarg", "defaults")
posonlyargs: list[arg]
args: list[arg]
vararg: arg | None
kwonlyargs: list[arg]
kw_defaults: list[expr | None]
kwarg: arg | None
defaults: list[expr]
class arg(AST):
if sys.version_info >= (3, 10):
__match_args__ = ("arg", "annotation", "type_comment")
arg: _Identifier
annotation: expr | None
class keyword(AST):
if sys.version_info >= (3, 10):
__match_args__ = ("arg", "value")
arg: _Identifier | None
value: expr
class alias(AST):
if sys.version_info >= (3, 10):
__match_args__ = ("name", "asname")
name: str
asname: _Identifier | None
class withitem(AST):
if sys.version_info >= (3, 10):
__match_args__ = ("context_expr", "optional_vars")
context_expr: expr
optional_vars: expr | None
if sys.version_info >= (3, 10):
class Match(stmt):
__match_args__ = ("subject", "cases")
subject: expr
cases: list[match_case]
class pattern(AST): ...
# Without the alias, Pyright complains variables named pattern are recursively defined
_Pattern: typing_extensions.TypeAlias = pattern
class match_case(AST):
__match_args__ = ("pattern", "guard", "body")
pattern: _Pattern
guard: expr | None
body: list[stmt]
class MatchValue(pattern):
__match_args__ = ("value",)
value: expr
class MatchSingleton(pattern):
__match_args__ = ("value",)
value: Literal[True, False] | None
class MatchSequence(pattern):
__match_args__ = ("patterns",)
patterns: list[pattern]
class MatchStar(pattern):
__match_args__ = ("name",)
name: _Identifier | None
class MatchMapping(pattern):
__match_args__ = ("keys", "patterns", "rest")
keys: list[expr]
patterns: list[pattern]
rest: _Identifier | None
class MatchClass(pattern):
__match_args__ = ("cls", "patterns", "kwd_attrs", "kwd_patterns")
cls: expr
patterns: list[pattern]
kwd_attrs: list[_Identifier]
kwd_patterns: list[pattern]
class MatchAs(pattern):
__match_args__ = ("pattern", "name")
pattern: _Pattern | None
name: _Identifier | None
class MatchOr(pattern):
__match_args__ = ("patterns",)
patterns: list[pattern]
if sys.version_info >= (3, 12):
class type_param(AST):
end_lineno: int
end_col_offset: int
class TypeVar(type_param):
__match_args__ = ("name", "bound")
name: _Identifier
bound: expr | None
class ParamSpec(type_param):
__match_args__ = ("name",)
name: _Identifier
class TypeVarTuple(type_param):
__match_args__ = ("name",)
name: _Identifier
class TypeAlias(stmt):
__match_args__ = ("name", "type_params", "value")
name: Name
type_params: list[type_param]
value: expr

View File

@@ -1,20 +0,0 @@
def make_archive(
base_name: str,
format: str,
root_dir: str | None = None,
base_dir: str | None = None,
verbose: int = 0,
dry_run: int = 0,
owner: str | None = None,
group: str | None = None,
) -> str: ...
def make_tarball(
base_name: str,
base_dir: str,
compress: str | None = "gzip",
verbose: int = 0,
dry_run: int = 0,
owner: str | None = None,
group: str | None = None,
) -> str: ...
def make_zipfile(base_name: str, base_dir: str, verbose: int = 0, dry_run: int = 0) -> str: ...

View File

@@ -1,66 +0,0 @@
from _typeshed import Incomplete
from abc import abstractmethod
from collections.abc import Callable, Iterable
from distutils.dist import Distribution
from typing import Any
class Command:
distribution: Distribution
sub_commands: list[tuple[str, Callable[[Command], bool] | None]]
def __init__(self, dist: Distribution) -> None: ...
@abstractmethod
def initialize_options(self) -> None: ...
@abstractmethod
def finalize_options(self) -> None: ...
@abstractmethod
def run(self) -> None: ...
def announce(self, msg: str, level: int = 1) -> None: ...
def debug_print(self, msg: str) -> None: ...
def ensure_string(self, option: str, default: str | None = None) -> None: ...
def ensure_string_list(self, option: str | list[str]) -> None: ...
def ensure_filename(self, option: str) -> None: ...
def ensure_dirname(self, option: str) -> None: ...
def get_command_name(self) -> str: ...
def set_undefined_options(self, src_cmd: str, *option_pairs: tuple[str, str]) -> None: ...
def get_finalized_command(self, command: str, create: int = 1) -> Command: ...
def reinitialize_command(self, command: Command | str, reinit_subcommands: int = 0) -> Command: ...
def run_command(self, command: str) -> None: ...
def get_sub_commands(self) -> list[str]: ...
def warn(self, msg: str) -> None: ...
def execute(self, func: Callable[..., object], args: Iterable[Any], msg: str | None = None, level: int = 1) -> None: ...
def mkpath(self, name: str, mode: int = 0o777) -> None: ...
def copy_file(
self, infile: str, outfile: str, preserve_mode: int = 1, preserve_times: int = 1, link: str | None = None, level: Any = 1
) -> tuple[str, bool]: ... # level is not used
def copy_tree(
self,
infile: str,
outfile: str,
preserve_mode: int = 1,
preserve_times: int = 1,
preserve_symlinks: int = 0,
level: Any = 1,
) -> list[str]: ... # level is not used
def move_file(self, src: str, dst: str, level: Any = 1) -> str: ... # level is not used
def spawn(self, cmd: Iterable[str], search_path: int = 1, level: Any = 1) -> None: ... # level is not used
def make_archive(
self,
base_name: str,
format: str,
root_dir: str | None = None,
base_dir: str | None = None,
owner: str | None = None,
group: str | None = None,
) -> str: ...
def make_file(
self,
infiles: str | list[str] | tuple[str, ...],
outfile: str,
func: Callable[..., object],
args: list[Any],
exec_msg: str | None = None,
skip_msg: str | None = None,
level: Any = 1,
) -> None: ... # level is not used
def ensure_finalized(self) -> None: ...
def dump_options(self, header: Incomplete | None = None, indent: str = "") -> None: ...

View File

@@ -1,3 +0,0 @@
def newer(source: str, target: str) -> bool: ...
def newer_pairwise(sources: list[str], targets: list[str]) -> list[tuple[str, str]]: ...
def newer_group(sources: list[str], target: str, missing: str = "error") -> bool: ...

View File

@@ -1,13 +0,0 @@
def mkpath(name: str, mode: int = 0o777, verbose: int = 1, dry_run: int = 0) -> list[str]: ...
def create_tree(base_dir: str, files: list[str], mode: int = 0o777, verbose: int = 1, dry_run: int = 0) -> None: ...
def copy_tree(
src: str,
dst: str,
preserve_mode: int = 1,
preserve_times: int = 1,
preserve_symlinks: int = 0,
update: int = 0,
verbose: int = 1,
dry_run: int = 0,
) -> list[str]: ...
def remove_tree(directory: str, verbose: int = 1, dry_run: int = 0) -> None: ...

View File

@@ -1,14 +0,0 @@
from collections.abc import Sequence
def copy_file(
src: str,
dst: str,
preserve_mode: bool = ...,
preserve_times: bool = ...,
update: bool = ...,
link: str | None = None,
verbose: bool = ...,
dry_run: bool = ...,
) -> tuple[str, str]: ...
def move_file(src: str, dst: str, verbose: bool = ..., dry_run: bool = ...) -> str: ...
def write_file(filename: str, contents: Sequence[str]) -> None: ...

View File

@@ -1,2 +0,0 @@
def spawn(cmd: list[str], search_path: bool = ..., verbose: bool = ..., dry_run: bool = ...) -> None: ...
def find_executable(executable: str, path: str | None = None) -> str | None: ...

View File

@@ -1,33 +0,0 @@
import builtins
import types
from _typeshed import ReadableBuffer, SupportsRead, SupportsWrite
from typing import Any
from typing_extensions import TypeAlias
version: int
_Marshallable: TypeAlias = (
# handled in w_object() in marshal.c
None
| type[StopIteration]
| builtins.ellipsis
| bool
# handled in w_complex_object() in marshal.c
| int
| float
| complex
| bytes
| str
| tuple[_Marshallable, ...]
| list[Any]
| dict[Any, Any]
| set[Any]
| frozenset[_Marshallable]
| types.CodeType
| ReadableBuffer
)
def dump(value: _Marshallable, file: SupportsWrite[bytes], version: int = 4, /) -> None: ...
def load(file: SupportsRead[bytes], /) -> Any: ...
def dumps(value: _Marshallable, version: int = 4, /) -> bytes: ...
def loads(bytes: ReadableBuffer, /) -> Any: ...

View File

@@ -1 +0,0 @@
from _stat import *

View File

@@ -0,0 +1,35 @@
[package]
name = "red_knot_module_resolver"
version = "0.0.0"
publish = false
authors = { workspace = true }
edition = { workspace = true }
rust-version = { workspace = true }
homepage = { workspace = true }
documentation = { workspace = true }
repository = { workspace = true }
license = { workspace = true }
[dependencies]
ruff_db = { workspace = true }
ruff_python_stdlib = { workspace = true }
rustc-hash = { workspace = true }
salsa = { workspace = true }
smol_str = { workspace = true }
tracing = { workspace = true }
zip = { workspace = true }
[build-dependencies]
path-slash = { workspace = true }
walkdir = { workspace = true }
zip = { workspace = true }
[dev-dependencies]
anyhow = { workspace = true }
insta = { workspace = true }
tempfile = { workspace = true }
walkdir = { workspace = true }
[lints]
workspace = true

View File

@@ -0,0 +1,9 @@
# Red Knot
A work-in-progress multifile module resolver for Ruff.
## Vendored types for the stdlib
This crate vendors [typeshed](https://github.com/python/typeshed)'s stubs for the standard library. The vendored stubs can be found in `crates/red_knot_module_resolver/vendor/typeshed`. The file `crates/red_knot_module_resolver/vendor/typeshed/source_commit.txt` tells you the typeshed commit that our vendored stdlib stubs currently correspond to.
The typeshed stubs are updated every two weeks via an automated PR using the `sync_typeshed.yaml` workflow in the `.github/workflows` directory. This workflow can also be triggered at any time via [workflow dispatch](https://docs.github.com/en/actions/using-workflows/manually-running-a-workflow#running-a-workflow).

View File

@@ -0,0 +1,74 @@
//! Build script to package our vendored typeshed files
//! into a zip archive that can be included in the Ruff binary.
//!
//! This script should be automatically run at build time
//! whenever the script itself changes, or whenever any files
//! in `crates/red_knot_module_resolver/vendor/typeshed` change.
use std::fs::File;
use std::path::Path;
use path_slash::PathExt;
use zip::result::ZipResult;
use zip::write::{FileOptions, ZipWriter};
use zip::CompressionMethod;
const TYPESHED_SOURCE_DIR: &str = "vendor/typeshed";
const TYPESHED_ZIP_LOCATION: &str = "/zipped_typeshed.zip";
/// Recursively zip the contents of an entire directory.
///
/// This routine is adapted from a recipe at
/// <https://github.com/zip-rs/zip-old/blob/5d0f198124946b7be4e5969719a7f29f363118cd/examples/write_dir.rs>
fn zip_dir(directory_path: &str, writer: File) -> ZipResult<File> {
let mut zip = ZipWriter::new(writer);
let options = FileOptions::default()
.compression_method(CompressionMethod::Zstd)
.unix_permissions(0o644);
for entry in walkdir::WalkDir::new(directory_path) {
let dir_entry = entry.unwrap();
let absolute_path = dir_entry.path();
let normalized_relative_path = absolute_path
.strip_prefix(Path::new(directory_path))
.unwrap()
.to_slash()
.expect("Unexpected non-utf8 typeshed path!");
// Write file or directory explicitly
// Some unzip tools unzip files with directory paths correctly, some do not!
if absolute_path.is_file() {
println!("adding file {absolute_path:?} as {normalized_relative_path:?} ...");
zip.start_file(normalized_relative_path, options)?;
let mut f = File::open(absolute_path)?;
std::io::copy(&mut f, &mut zip).unwrap();
} else if !normalized_relative_path.is_empty() {
// Only add a directory entry if it isn't the root itself; otherwise some
// unzip tools emit "path spec" warnings and "mapname conversion failed" errors.
println!("adding dir {absolute_path:?} as {normalized_relative_path:?} ...");
zip.add_directory(normalized_relative_path, options)?;
}
}
zip.finish()
}
fn main() {
println!("cargo:rerun-if-changed={TYPESHED_SOURCE_DIR}");
assert!(
Path::new(TYPESHED_SOURCE_DIR).is_dir(),
"Where is typeshed?"
);
let out_dir = std::env::var("OUT_DIR").unwrap();
// N.B. Deliberately using `format!()` instead of `Path::join()` here,
// so that we use `/` as a path separator on all platforms.
// That enables us to load the typeshed zip at compile time in `module.rs`
// (otherwise we'd have to dynamically determine the exact path to the typeshed zip
// based on the default path separator for the specific platform we're on,
// which can't be done at compile time.)
let zipped_typeshed_location = format!("{out_dir}{TYPESHED_ZIP_LOCATION}");
let zipped_typeshed = File::create(zipped_typeshed_location).unwrap();
zip_dir(TYPESHED_SOURCE_DIR, zipped_typeshed).unwrap();
}
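// Consumer-side sketch (the constant name is illustrative; the commented-out
// typeshed test in `resolver.rs` uses this same incantation): because the zip
// is written to a fixed, `/`-separated location under `OUT_DIR`, the crate
// being built can embed it at compile time:
//
//     const TYPESHED_ZIP_BYTES: &[u8] =
//         include_bytes!(concat!(env!("OUT_DIR"), "/zipped_typeshed.zip"));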

View File

@@ -0,0 +1,156 @@
use ruff_db::Upcast;
use crate::resolver::{
file_to_module,
internal::{ModuleNameIngredient, ModuleResolverSearchPaths},
resolve_module_query,
};
#[salsa::jar(db=Db)]
pub struct Jar(
ModuleNameIngredient<'_>,
ModuleResolverSearchPaths,
resolve_module_query,
file_to_module,
);
pub trait Db: salsa::DbWithJar<Jar> + ruff_db::Db + Upcast<dyn ruff_db::Db> {}
pub(crate) mod tests {
use std::sync;
use salsa::DebugWithDb;
use ruff_db::file_system::{FileSystem, MemoryFileSystem, OsFileSystem};
use ruff_db::vfs::Vfs;
use super::*;
#[salsa::db(Jar, ruff_db::Jar)]
pub(crate) struct TestDb {
storage: salsa::Storage<Self>,
file_system: TestFileSystem,
events: sync::Arc<sync::Mutex<Vec<salsa::Event>>>,
vfs: Vfs,
}
impl TestDb {
#[allow(unused)]
pub(crate) fn new() -> Self {
Self {
storage: salsa::Storage::default(),
file_system: TestFileSystem::Memory(MemoryFileSystem::default()),
events: sync::Arc::default(),
vfs: Vfs::with_stubbed_vendored(),
}
}
/// Returns the memory file system.
///
/// ## Panics
/// If this test db isn't using a memory file system.
#[allow(unused)]
pub(crate) fn memory_file_system(&self) -> &MemoryFileSystem {
if let TestFileSystem::Memory(fs) = &self.file_system {
fs
} else {
panic!("The test db is not using a memory file system");
}
}
/// Uses the real file system instead of the memory file system.
///
/// This is useful for testing advanced file system features like permissions, symlinks, etc.
///
/// Note that any files written to the memory file system won't be copied over.
#[allow(unused)]
pub(crate) fn with_os_file_system(&mut self) {
self.file_system = TestFileSystem::Os(OsFileSystem);
}
#[allow(unused)]
pub(crate) fn vfs_mut(&mut self) -> &mut Vfs {
&mut self.vfs
}
/// Takes the salsa events.
///
/// ## Panics
/// If there are any pending salsa snapshots.
#[allow(unused)]
pub(crate) fn take_salsa_events(&mut self) -> Vec<salsa::Event> {
let inner = sync::Arc::get_mut(&mut self.events).expect("no pending salsa snapshots");
let events = inner.get_mut().unwrap();
std::mem::take(&mut *events)
}
/// Clears the salsa events.
///
/// ## Panics
/// If there are any pending salsa snapshots.
#[allow(unused)]
pub(crate) fn clear_salsa_events(&mut self) {
self.take_salsa_events();
}
}
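// A minimal usage sketch (the helper name is hypothetical, and it assumes
// `write_file` behaves as in the resolver tests): create a db backed by the
// in-memory file system and write a module's source into it.
#[allow(unused)]
fn test_db_usage_sketch() -> std::io::Result<()> {
let db = TestDb::new();
let path = ruff_db::file_system::FileSystemPath::new("src/foo.py").to_path_buf();
db.memory_file_system().write_file(&path, "x = 1")?;
Ok(())
}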
impl Upcast<dyn ruff_db::Db> for TestDb {
fn upcast(&self) -> &(dyn ruff_db::Db + 'static) {
self
}
}
impl ruff_db::Db for TestDb {
fn file_system(&self) -> &dyn ruff_db::file_system::FileSystem {
self.file_system.inner()
}
fn vfs(&self) -> &ruff_db::vfs::Vfs {
&self.vfs
}
}
impl Db for TestDb {}
impl salsa::Database for TestDb {
fn salsa_event(&self, event: salsa::Event) {
tracing::trace!("event: {:?}", event.debug(self));
let mut events = self.events.lock().unwrap();
events.push(event);
}
}
impl salsa::ParallelDatabase for TestDb {
fn snapshot(&self) -> salsa::Snapshot<Self> {
salsa::Snapshot::new(Self {
storage: self.storage.snapshot(),
file_system: self.file_system.snapshot(),
events: self.events.clone(),
vfs: self.vfs.snapshot(),
})
}
}
enum TestFileSystem {
Memory(MemoryFileSystem),
#[allow(unused)]
Os(OsFileSystem),
}
impl TestFileSystem {
fn inner(&self) -> &dyn FileSystem {
match self {
Self::Memory(inner) => inner,
Self::Os(inner) => inner,
}
}
fn snapshot(&self) -> Self {
match self {
Self::Memory(inner) => Self::Memory(inner.snapshot()),
Self::Os(inner) => Self::Os(inner.snapshot()),
}
}
}
}

View File

@@ -0,0 +1,9 @@
mod db;
mod module;
mod resolver;
mod typeshed;
pub use db::{Db, Jar};
pub use module::{ModuleKind, ModuleName};
pub use resolver::{resolve_module, set_module_resolution_settings, ModuleResolutionSettings};
pub use typeshed::versions::TypeshedVersions;

View File

@@ -0,0 +1,346 @@
use std::fmt::Formatter;
use std::ops::Deref;
use std::sync::Arc;
use ruff_db::file_system::FileSystemPath;
use ruff_db::vfs::{VfsFile, VfsPath};
use ruff_python_stdlib::identifiers::is_identifier;
use crate::Db;
/// A module name, e.g. `foo.bar`.
///
/// Always normalized to the absolute form (never a relative module name, i.e., never `.foo`).
#[derive(Clone, Debug, Eq, PartialEq, Hash, PartialOrd, Ord)]
pub struct ModuleName(smol_str::SmolStr);
impl ModuleName {
/// Creates a new module name for `name`. Returns `Some` if `name` is a valid, absolute
/// module name and `None` otherwise.
///
/// The module name is invalid if:
///
/// * The name is empty
/// * The name is relative
/// * The name ends with a `.`
/// * The name contains a sequence of multiple dots
/// * A component of a name (the part between two dots) isn't a valid python identifier.
#[inline]
pub fn new(name: &str) -> Option<Self> {
Self::new_from_smol(smol_str::SmolStr::new(name))
}
/// Creates a new module name for `name` where `name` is a static string.
/// Returns `Some` if `name` is a valid, absolute module name and `None` otherwise.
///
/// The module name is invalid if:
///
/// * The name is empty
/// * The name is relative
/// * The name ends with a `.`
/// * The name contains a sequence of multiple dots
/// * A component of a name (the part between two dots) isn't a valid python identifier.
///
/// ## Examples
///
/// ```
/// use red_knot_module_resolver::ModuleName;
///
/// assert_eq!(ModuleName::new_static("foo.bar").as_deref(), Some("foo.bar"));
/// assert_eq!(ModuleName::new_static(""), None);
/// assert_eq!(ModuleName::new_static("..foo"), None);
/// assert_eq!(ModuleName::new_static(".foo"), None);
/// assert_eq!(ModuleName::new_static("foo."), None);
/// assert_eq!(ModuleName::new_static("foo..bar"), None);
/// assert_eq!(ModuleName::new_static("2000"), None);
/// ```
#[inline]
pub fn new_static(name: &'static str) -> Option<Self> {
Self::new_from_smol(smol_str::SmolStr::new_static(name))
}
fn new_from_smol(name: smol_str::SmolStr) -> Option<Self> {
if name.is_empty() {
return None;
}
if name.split('.').all(is_identifier) {
Some(Self(name))
} else {
None
}
}
/// An iterator over the components of the module name:
///
/// # Examples
///
/// ```
/// use red_knot_module_resolver::ModuleName;
///
/// assert_eq!(ModuleName::new_static("foo.bar.baz").unwrap().components().collect::<Vec<_>>(), vec!["foo", "bar", "baz"]);
/// ```
pub fn components(&self) -> impl DoubleEndedIterator<Item = &str> {
self.0.split('.')
}
/// The name of this module's immediate parent, if it has a parent.
///
/// # Examples
///
/// ```
/// use red_knot_module_resolver::ModuleName;
///
/// assert_eq!(ModuleName::new_static("foo.bar").unwrap().parent(), Some(ModuleName::new_static("foo").unwrap()));
/// assert_eq!(ModuleName::new_static("foo.bar.baz").unwrap().parent(), Some(ModuleName::new_static("foo.bar").unwrap()));
/// assert_eq!(ModuleName::new_static("root").unwrap().parent(), None);
/// ```
pub fn parent(&self) -> Option<ModuleName> {
let (parent, _) = self.0.rsplit_once('.')?;
Some(Self(smol_str::SmolStr::new(parent)))
}
/// Returns `true` if the name starts with `other`.
///
/// This is equivalent to checking if `self` is a sub-module of `other`.
///
/// # Examples
///
/// ```
/// use red_knot_module_resolver::ModuleName;
///
/// assert!(ModuleName::new_static("foo.bar").unwrap().starts_with(&ModuleName::new_static("foo").unwrap()));
///
/// assert!(!ModuleName::new_static("foo.bar").unwrap().starts_with(&ModuleName::new_static("bar").unwrap()));
/// assert!(!ModuleName::new_static("foo_bar").unwrap().starts_with(&ModuleName::new_static("foo").unwrap()));
/// ```
pub fn starts_with(&self, other: &ModuleName) -> bool {
let mut self_components = self.components();
let other_components = other.components();
for other_component in other_components {
if self_components.next() != Some(other_component) {
return false;
}
}
true
}
#[inline]
pub fn as_str(&self) -> &str {
&self.0
}
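/// Converts a path relative to a module search path into a module name,
/// e.g. `foo/bar.py`, `foo/bar.pyi`, and `foo/bar/__init__.py` all map to `foo.bar`.
///
/// Returns `None` if a path component isn't valid UTF-8 or the path has no file stem.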
pub(crate) fn from_relative_path(path: &FileSystemPath) -> Option<Self> {
let path = if path.ends_with("__init__.py") || path.ends_with("__init__.pyi") {
path.parent()?
} else {
path
};
let name = if let Some(parent) = path.parent() {
let mut name = String::with_capacity(path.as_str().len());
for component in parent.components() {
name.push_str(component.as_os_str().to_str()?);
name.push('.');
}
// SAFETY: Unwrap is safe here or `parent` would have returned `None`.
name.push_str(path.file_stem().unwrap());
smol_str::SmolStr::from(name)
} else {
smol_str::SmolStr::new(path.file_stem()?)
};
Some(Self(name))
}
}
impl Deref for ModuleName {
type Target = str;
#[inline]
fn deref(&self) -> &Self::Target {
self.as_str()
}
}
impl PartialEq<str> for ModuleName {
fn eq(&self, other: &str) -> bool {
self.as_str() == other
}
}
impl PartialEq<ModuleName> for str {
fn eq(&self, other: &ModuleName) -> bool {
self == other.as_str()
}
}
impl std::fmt::Display for ModuleName {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.write_str(&self.0)
}
}
/// Representation of a Python module.
#[derive(Clone, PartialEq, Eq)]
pub struct Module {
inner: Arc<ModuleInner>,
}
impl Module {
pub(crate) fn new(
name: ModuleName,
kind: ModuleKind,
search_path: ModuleSearchPath,
file: VfsFile,
) -> Self {
Self {
inner: Arc::new(ModuleInner {
name,
kind,
search_path,
file,
}),
}
}
/// The absolute name of the module (e.g. `foo.bar`)
pub fn name(&self) -> &ModuleName {
&self.inner.name
}
/// The file containing the source code that defines this module
pub fn file(&self) -> VfsFile {
self.inner.file
}
/// The search path from which the module was resolved.
pub fn search_path(&self) -> &ModuleSearchPath {
&self.inner.search_path
}
/// Determine whether this module is a single-file module or a package
pub fn kind(&self) -> ModuleKind {
self.inner.kind
}
}
impl std::fmt::Debug for Module {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_struct("Module")
.field("name", &self.name())
.field("kind", &self.kind())
.field("file", &self.file())
.field("search_path", &self.search_path())
.finish()
}
}
impl salsa::DebugWithDb<dyn Db> for Module {
fn fmt(&self, f: &mut Formatter<'_>, db: &dyn Db) -> std::fmt::Result {
f.debug_struct("Module")
.field("name", &self.name())
.field("kind", &self.kind())
.field("file", &self.file().debug(db.upcast()))
.field("search_path", &self.search_path())
.finish()
}
}
#[derive(PartialEq, Eq)]
struct ModuleInner {
name: ModuleName,
kind: ModuleKind,
search_path: ModuleSearchPath,
file: VfsFile,
}
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub enum ModuleKind {
/// A single-file module (e.g. `foo.py` or `foo.pyi`)
Module,
/// A python package (`foo/__init__.py` or `foo/__init__.pyi`)
Package,
}
/// A search path in which to search for modules.
/// Corresponds to a path in [`sys.path`](https://docs.python.org/3/library/sys_path_init.html) at runtime.
///
/// Cloning a search path is cheap because it's an `Arc`.
#[derive(Clone, PartialEq, Eq)]
pub struct ModuleSearchPath {
inner: Arc<ModuleSearchPathInner>,
}
impl ModuleSearchPath {
pub fn new<P>(path: P, kind: ModuleSearchPathKind) -> Self
where
P: Into<VfsPath>,
{
Self {
inner: Arc::new(ModuleSearchPathInner {
path: path.into(),
kind,
}),
}
}
/// Determine whether this is a first-party, third-party or standard-library search path
pub fn kind(&self) -> ModuleSearchPathKind {
self.inner.kind
}
/// Return the location of the search path on the file system
pub fn path(&self) -> &VfsPath {
&self.inner.path
}
}
impl std::fmt::Debug for ModuleSearchPath {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_struct("ModuleSearchPath")
.field("path", &self.inner.path)
.field("kind", &self.kind())
.finish()
}
}
#[derive(Eq, PartialEq)]
struct ModuleSearchPathInner {
path: VfsPath,
kind: ModuleSearchPathKind,
}
/// Enumeration of the different kinds of search paths type checkers are expected to support.
///
/// N.B. Although we don't implement `Ord` for this enum, the variants are declared
/// in order of the priority we want to give these search paths when resolving modules.
/// This is roughly [the order given in the typing spec], but typeshed's stubs
/// for the standard library are moved higher up to match Python's semantics at runtime.
///
/// [the order given in the typing spec]: https://typing.readthedocs.io/en/latest/spec/distributing.html#import-resolution-ordering
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub enum ModuleSearchPathKind {
/// "Extra" paths provided by the user in a config file, env var or CLI flag.
/// E.g. mypy's `MYPYPATH` env var, or pyright's `stubPath` configuration setting
Extra,
/// Files in the project we're directly being invoked on
FirstParty,
/// The `stdlib` directory of typeshed (either vendored or custom)
StandardLibrary,
/// Stubs or runtime modules installed in site-packages
SitePackagesThirdParty,
/// Vendored third-party stubs from typeshed
VendoredThirdParty,
}

View File

@@ -0,0 +1,938 @@
use salsa::DebugWithDb;
use std::ops::Deref;
use ruff_db::file_system::{FileSystem, FileSystemPath, FileSystemPathBuf};
use ruff_db::vfs::{system_path_to_file, vfs_path_to_file, VfsFile, VfsPath};
use crate::module::{Module, ModuleKind, ModuleName, ModuleSearchPath, ModuleSearchPathKind};
use crate::resolver::internal::ModuleResolverSearchPaths;
use crate::Db;
const TYPESHED_STDLIB_DIRECTORY: &str = "stdlib";
/// Configures the module search paths for the module resolver.
///
/// Must be called before calling any other module resolution functions.
pub fn set_module_resolution_settings(db: &mut dyn Db, config: ModuleResolutionSettings) {
// There's no concurrency issue here because we hold a `&mut dyn Db` reference. No other
// thread can mutate the `Db` while we're in this call, so using `try_get` to test if
// the settings have already been set is safe.
if let Some(existing) = ModuleResolverSearchPaths::try_get(db) {
existing
.set_search_paths(db)
.to(config.into_ordered_search_paths());
} else {
ModuleResolverSearchPaths::new(db, config.into_ordered_search_paths());
}
}
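// A minimal usage sketch (paths illustrative; mirrors `create_resolver` in the
// tests at the bottom of this file):
//
//     let mut db = /* any database implementing `Db` */;
//     set_module_resolution_settings(&mut db, ModuleResolutionSettings {
//         extra_paths: vec![],
//         workspace_root: FileSystemPath::new("src").to_path_buf(),
//         site_packages: None,
//         custom_typeshed: None,
//     });
//     let foo = resolve_module(&db, ModuleName::new_static("foo").unwrap());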
/// Resolves a module name to a module.
pub fn resolve_module(db: &dyn Db, module_name: ModuleName) -> Option<Module> {
let interned_name = internal::ModuleNameIngredient::new(db, module_name);
resolve_module_query(db, interned_name)
}
/// Salsa query that resolves an interned [`ModuleNameIngredient`] to a module.
///
/// This query should not be called directly. Instead, use [`resolve_module`]. It only exists
/// because Salsa requires the module name to be an ingredient.
#[salsa::tracked]
pub(crate) fn resolve_module_query<'db>(
db: &'db dyn Db,
module_name: internal::ModuleNameIngredient<'db>,
) -> Option<Module> {
let _ = tracing::trace_span!("resolve_module", module_name = ?module_name.debug(db)).enter();
let name = module_name.name(db);
let (search_path, module_file, kind) = resolve_name(db, name)?;
let module = Module::new(name.clone(), kind, search_path, module_file);
Some(module)
}
/// Resolves the module for the given path.
///
/// Returns `None` if the path is not a module locatable via `sys.path`.
#[tracing::instrument(level = "debug", skip(db))]
pub fn path_to_module(db: &dyn Db, path: &VfsPath) -> Option<Module> {
// It's not entirely clear at first sight why this method calls `file_to_module` instead of
// it being the other way round, considering that the first thing that `file_to_module` does
// is to retrieve the file's path.
//
// The reason is that `file_to_module` is a tracked Salsa query and salsa queries require that
// all arguments are Salsa ingredients (something stored in Salsa). `Path`s aren't Salsa ingredients, but
// `VfsFile` is. So what we do here is retrieve the `path`'s `VfsFile` so that we can make
// use of Salsa's caching and invalidation.
let file = vfs_path_to_file(db.upcast(), path)?;
file_to_module(db, file)
}
/// Resolves the module for the file with the given id.
///
/// Returns `None` if the file is not a module locatable via `sys.path`.
#[salsa::tracked]
#[allow(unused)]
pub(crate) fn file_to_module(db: &dyn Db, file: VfsFile) -> Option<Module> {
let _ = tracing::trace_span!("file_to_module", file = ?file.debug(db.upcast())).enter();
let path = file.path(db.upcast());
let search_paths = module_search_paths(db);
let relative_path = search_paths
.iter()
.find_map(|root| match (root.path(), path) {
(VfsPath::FileSystem(root_path), VfsPath::FileSystem(path)) => {
let relative_path = path.strip_prefix(root_path).ok()?;
Some(relative_path)
}
(VfsPath::Vendored(_), VfsPath::Vendored(_)) => {
todo!("Add support for vendored modules")
}
(VfsPath::Vendored(_), VfsPath::FileSystem(_))
| (VfsPath::FileSystem(_), VfsPath::Vendored(_)) => None,
})?;
let module_name = ModuleName::from_relative_path(relative_path)?;
// Resolve the module name to see if Python would resolve the name to the same path.
// If it doesn't, then that means that multiple modules have the same name in different
// root paths, but that the module corresponding to `path` is in a lower priority search path,
// in which case we ignore it.
let module = resolve_module(db, module_name)?;
if file == module.file() {
Some(module)
} else {
// This path is for a module with the same name but with a different precedence. For example:
// ```
// src/foo.py
// src/foo/__init__.py
// ```
// The module name of `src/foo.py` is `foo`, but the module loaded by Python is `src/foo/__init__.py`.
// That means we need to ignore `src/foo.py` even though it resolves to the same module name.
None
}
}
/// Configures the search paths that are used to resolve modules.
#[derive(Eq, PartialEq, Debug)]
pub struct ModuleResolutionSettings {
/// List of user-provided paths that should take first priority in the module resolution.
/// Examples in other type checkers are mypy's MYPYPATH environment variable,
/// or pyright's stubPath configuration setting.
pub extra_paths: Vec<FileSystemPathBuf>,
/// The root of the workspace, used for finding first-party modules.
pub workspace_root: FileSystemPathBuf,
/// The path to the user's `site-packages` directory, where third-party packages from ``PyPI`` are installed.
pub site_packages: Option<FileSystemPathBuf>,
/// Optional path to standard-library typeshed stubs.
/// Currently this has to be a directory that exists on disk.
///
/// (TODO: fall back to vendored stubs if no custom directory is provided.)
pub custom_typeshed: Option<FileSystemPathBuf>,
}
impl ModuleResolutionSettings {
/// Implementation of PEP 561's module resolution order
/// (with some small, deliberate differences)
fn into_ordered_search_paths(self) -> OrderedSearchPaths {
let ModuleResolutionSettings {
extra_paths,
workspace_root,
site_packages,
custom_typeshed,
} = self;
let mut paths: Vec<_> = extra_paths
.into_iter()
.map(|path| ModuleSearchPath::new(path, ModuleSearchPathKind::Extra))
.collect();
paths.push(ModuleSearchPath::new(
workspace_root,
ModuleSearchPathKind::FirstParty,
));
// TODO fallback to vendored typeshed stubs if no custom typeshed directory is provided by the user
if let Some(custom_typeshed) = custom_typeshed {
paths.push(ModuleSearchPath::new(
custom_typeshed.join(TYPESHED_STDLIB_DIRECTORY),
ModuleSearchPathKind::StandardLibrary,
));
}
// TODO vendor typeshed's third-party stubs as well as the stdlib and fallback to them as a final step
if let Some(site_packages) = site_packages {
paths.push(ModuleSearchPath::new(
site_packages,
ModuleSearchPathKind::SitePackagesThirdParty,
));
}
OrderedSearchPaths(paths)
}
}
/// The resolved search paths in module-resolution order, implementing PEP 561
/// (with some small, deliberate differences)
#[derive(Clone, Debug, Default, Eq, PartialEq)]
pub(crate) struct OrderedSearchPaths(Vec<ModuleSearchPath>);
impl Deref for OrderedSearchPaths {
type Target = [ModuleSearchPath];
fn deref(&self) -> &Self::Target {
&self.0
}
}
// The singleton methods generated by salsa are all `pub` instead of `pub(crate)` which triggers
// `unreachable_pub`. Work around this by creating a module and allow `unreachable_pub` for it.
// Salsa also generates uses of `_db` variables for `interned`, which triggers `clippy::used_underscore_binding`. Suppress that too.
// TODO(micha): Contribute a fix for this upstream where the singleton methods have the same visibility as the struct.
#[allow(unreachable_pub, clippy::used_underscore_binding)]
pub(crate) mod internal {
use crate::module::ModuleName;
use crate::resolver::OrderedSearchPaths;
#[salsa::input(singleton)]
pub(crate) struct ModuleResolverSearchPaths {
#[return_ref]
pub(super) search_paths: OrderedSearchPaths,
}
/// A thin wrapper around `ModuleName` to make it a Salsa ingredient.
///
/// This is needed because Salsa requires that all query arguments are salsa ingredients.
#[salsa::interned]
pub(crate) struct ModuleNameIngredient<'db> {
#[return_ref]
pub(super) name: ModuleName,
}
}
fn module_search_paths(db: &dyn Db) -> &[ModuleSearchPath] {
ModuleResolverSearchPaths::get(db).search_paths(db)
}
/// Given a module name and a list of search paths in which to look up modules,
/// attempt to resolve the module name
fn resolve_name(db: &dyn Db, name: &ModuleName) -> Option<(ModuleSearchPath, VfsFile, ModuleKind)> {
let search_paths = module_search_paths(db);
for search_path in search_paths {
let mut components = name.components();
let module_name = components.next_back()?;
let VfsPath::FileSystem(fs_search_path) = search_path.path() else {
todo!("Vendored search paths are not yet supported");
};
match resolve_package(db.file_system(), fs_search_path, components) {
Ok(resolved_package) => {
let mut package_path = resolved_package.path;
package_path.push(module_name);
// Must contain an `__init__.pyi` or `__init__.py`, or it isn't a package.
let kind = if db.file_system().is_directory(&package_path) {
package_path.push("__init__");
ModuleKind::Package
} else {
ModuleKind::Module
};
// TODO Implement full https://peps.python.org/pep-0561/#type-checker-module-resolution-order resolution
let stub = package_path.with_extension("pyi");
if let Some(stub) = system_path_to_file(db.upcast(), &stub) {
return Some((search_path.clone(), stub, kind));
}
let module = package_path.with_extension("py");
if let Some(module) = system_path_to_file(db.upcast(), &module) {
return Some((search_path.clone(), module, kind));
}
// For regular packages, don't search the next search path. All files of that
// package must be in the same location
if resolved_package.kind.is_regular_package() {
return None;
}
}
Err(parent_kind) => {
if parent_kind.is_regular_package() {
// For regular packages, don't search the next search path.
return None;
}
}
}
}
None
}
fn resolve_package<'a, I>(
fs: &dyn FileSystem,
module_search_path: &FileSystemPath,
components: I,
) -> Result<ResolvedPackage, PackageKind>
where
I: Iterator<Item = &'a str>,
{
let mut package_path = module_search_path.to_path_buf();
// `true` if inside a folder that is a namespace package (has no `__init__.py`).
// Namespace packages are special because they can be spread across multiple search paths.
// https://peps.python.org/pep-0420/
let mut in_namespace_package = false;
// `true` if resolving a sub-package. For example, `true` when resolving `bar` of `foo.bar`.
let mut in_sub_package = false;
// For `foo.bar.baz`, test that `foo` and `foo/bar` both contain an `__init__.py`.
for folder in components {
package_path.push(folder);
let has_init_py = fs.is_file(&package_path.join("__init__.py"))
|| fs.is_file(&package_path.join("__init__.pyi"));
if has_init_py {
in_namespace_package = false;
} else if fs.is_directory(&package_path) {
// A directory without an `__init__.py` is a namespace package, continue with the next folder.
in_namespace_package = true;
} else if in_namespace_package {
// Package not found but it is part of a namespace package.
return Err(PackageKind::Namespace);
} else if in_sub_package {
// A regular sub package wasn't found.
return Err(PackageKind::Regular);
} else {
// We couldn't find `foo` for `foo.bar.baz`, search the next search path.
return Err(PackageKind::Root);
}
in_sub_package = true;
}
let kind = if in_namespace_package {
PackageKind::Namespace
} else if in_sub_package {
PackageKind::Regular
} else {
PackageKind::Root
};
Ok(ResolvedPackage {
kind,
path: package_path,
})
}
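// Worked sketch (illustrative; uses the `MemoryFileSystem` test helper the same
// way the tests below do). Resolving the module `foo.bar` passes the components
// `["foo"]` here:
//
//     let fs = MemoryFileSystem::default();
//     fs.write_file(&FileSystemPath::new("src/foo/__init__.py").to_path_buf(), "")?;
//     let pkg = resolve_package(&fs, FileSystemPath::new("src"), ["foo"].into_iter())?;
//     assert!(matches!(pkg.kind, PackageKind::Regular));
//
// Without the `__init__.py`, `src/foo` would instead be treated as a PEP 420
// namespace package and later search paths would still be consulted.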
#[derive(Debug)]
struct ResolvedPackage {
path: FileSystemPathBuf,
kind: PackageKind,
}
#[derive(Copy, Clone, Eq, PartialEq, Debug)]
enum PackageKind {
/// A root package or module. E.g. `foo` in `foo.bar.baz` or just `foo`.
Root,
/// A regular sub-package where the parent contains an `__init__.py`.
///
/// For example, `bar` in `foo.bar` when the `foo` directory contains an `__init__.py`.
Regular,
/// A sub-package in a namespace package. A namespace package is a package without an `__init__.py`.
///
/// For example, `bar` in `foo.bar` if the `foo` directory contains no `__init__.py`.
Namespace,
}
impl PackageKind {
const fn is_regular_package(self) -> bool {
matches!(self, PackageKind::Regular)
}
}
#[cfg(test)]
mod tests {
use ruff_db::file_system::{FileSystemPath, FileSystemPathBuf};
use ruff_db::vfs::{system_path_to_file, VfsFile, VfsPath};
use crate::db::tests::TestDb;
use crate::module::{ModuleKind, ModuleName};
use super::{
path_to_module, resolve_module, set_module_resolution_settings, ModuleResolutionSettings,
TYPESHED_STDLIB_DIRECTORY,
};
struct TestCase {
db: TestDb,
src: FileSystemPathBuf,
custom_typeshed: FileSystemPathBuf,
site_packages: FileSystemPathBuf,
}
fn create_resolver() -> std::io::Result<TestCase> {
let mut db = TestDb::new();
let src = FileSystemPath::new("src").to_path_buf();
let site_packages = FileSystemPath::new("site_packages").to_path_buf();
let custom_typeshed = FileSystemPath::new("typeshed").to_path_buf();
let fs = db.memory_file_system();
fs.create_directory_all(&src)?;
fs.create_directory_all(&site_packages)?;
fs.create_directory_all(&custom_typeshed)?;
let settings = ModuleResolutionSettings {
extra_paths: vec![],
workspace_root: src.clone(),
site_packages: Some(site_packages.clone()),
custom_typeshed: Some(custom_typeshed.clone()),
};
set_module_resolution_settings(&mut db, settings);
Ok(TestCase {
db,
src,
custom_typeshed,
site_packages,
})
}
#[test]
fn first_party_module() -> anyhow::Result<()> {
let TestCase { db, src, .. } = create_resolver()?;
let foo_module_name = ModuleName::new_static("foo").unwrap();
let foo_path = src.join("foo.py");
db.memory_file_system()
.write_file(&foo_path, "print('Hello, world!')")?;
let foo_module = resolve_module(&db, foo_module_name.clone()).unwrap();
assert_eq!(
Some(&foo_module),
resolve_module(&db, foo_module_name.clone()).as_ref()
);
assert_eq!("foo", foo_module.name());
assert_eq!(&src, foo_module.search_path().path());
assert_eq!(ModuleKind::Module, foo_module.kind());
assert_eq!(&foo_path, foo_module.file().path(&db));
assert_eq!(
Some(foo_module),
path_to_module(&db, &VfsPath::FileSystem(foo_path))
);
Ok(())
}
#[test]
fn stdlib() -> anyhow::Result<()> {
let TestCase {
db,
custom_typeshed,
..
} = create_resolver()?;
let stdlib_dir = custom_typeshed.join(TYPESHED_STDLIB_DIRECTORY);
let functools_path = stdlib_dir.join("functools.py");
db.memory_file_system()
.write_file(&functools_path, "def update_wrapper(): ...")?;
let functools_module_name = ModuleName::new_static("functools").unwrap();
let functools_module = resolve_module(&db, functools_module_name.clone()).unwrap();
assert_eq!(
Some(&functools_module),
resolve_module(&db, functools_module_name).as_ref()
);
assert_eq!(&stdlib_dir, functools_module.search_path().path());
assert_eq!(ModuleKind::Module, functools_module.kind());
assert_eq!(&functools_path.clone(), functools_module.file().path(&db));
assert_eq!(
Some(functools_module),
path_to_module(&db, &VfsPath::FileSystem(functools_path))
);
Ok(())
}
#[test]
fn first_party_precedence_over_stdlib() -> anyhow::Result<()> {
let TestCase {
db,
src,
custom_typeshed,
..
} = create_resolver()?;
let stdlib_dir = custom_typeshed.join(TYPESHED_STDLIB_DIRECTORY);
let stdlib_functools_path = stdlib_dir.join("functools.py");
let first_party_functools_path = src.join("functools.py");
db.memory_file_system().write_files([
(&stdlib_functools_path, "def update_wrapper(): ..."),
(&first_party_functools_path, "def update_wrapper(): ..."),
])?;
let functools_module_name = ModuleName::new_static("functools").unwrap();
let functools_module = resolve_module(&db, functools_module_name.clone()).unwrap();
assert_eq!(
Some(&functools_module),
resolve_module(&db, functools_module_name).as_ref()
);
assert_eq!(&src, functools_module.search_path().path());
assert_eq!(ModuleKind::Module, functools_module.kind());
assert_eq!(
&first_party_functools_path.clone(),
functools_module.file().path(&db)
);
assert_eq!(
Some(functools_module),
path_to_module(&db, &VfsPath::FileSystem(first_party_functools_path))
);
Ok(())
}
// TODO: Port typeshed test case. Porting isn't possible at the moment because the vendored zip
// is part of the red knot crate
// #[test]
// fn typeshed_zip_created_at_build_time() -> anyhow::Result<()> {
// // The file path here is hardcoded in this crate's `build.rs` script.
// // Luckily this crate will fail to build if this file isn't available at build time.
// const TYPESHED_ZIP_BYTES: &[u8] =
// include_bytes!(concat!(env!("OUT_DIR"), "/zipped_typeshed.zip"));
// assert!(!TYPESHED_ZIP_BYTES.is_empty());
// let mut typeshed_zip_archive = ZipArchive::new(Cursor::new(TYPESHED_ZIP_BYTES))?;
//
// let path_to_functools = Path::new("stdlib").join("functools.pyi");
// let mut functools_module_stub = typeshed_zip_archive
// .by_name(path_to_functools.to_str().unwrap())
// .unwrap();
// assert!(functools_module_stub.is_file());
//
// let mut functools_module_stub_source = String::new();
// functools_module_stub.read_to_string(&mut functools_module_stub_source)?;
//
// assert!(functools_module_stub_source.contains("def update_wrapper("));
// Ok(())
// }
#[test]
fn resolve_package() -> anyhow::Result<()> {
let TestCase { src, db, .. } = create_resolver()?;
let foo_dir = src.join("foo");
let foo_path = foo_dir.join("__init__.py");
db.memory_file_system()
.write_file(&foo_path, "print('Hello, world!')")?;
let foo_module = resolve_module(&db, ModuleName::new_static("foo").unwrap()).unwrap();
assert_eq!("foo", foo_module.name());
assert_eq!(&src, foo_module.search_path().path());
assert_eq!(&foo_path, foo_module.file().path(&db));
assert_eq!(
Some(&foo_module),
path_to_module(&db, &VfsPath::FileSystem(foo_path)).as_ref()
);
// Resolving by directory doesn't resolve to the init file.
assert_eq!(None, path_to_module(&db, &VfsPath::FileSystem(foo_dir)));
Ok(())
}
#[test]
fn package_priority_over_module() -> anyhow::Result<()> {
let TestCase { db, src, .. } = create_resolver()?;
let foo_dir = src.join("foo");
let foo_init = foo_dir.join("__init__.py");
db.memory_file_system()
.write_file(&foo_init, "print('Hello, world!')")?;
let foo_py = src.join("foo.py");
db.memory_file_system()
.write_file(&foo_py, "print('Hello, world!')")?;
let foo_module = resolve_module(&db, ModuleName::new_static("foo").unwrap()).unwrap();
assert_eq!(&src, foo_module.search_path().path());
assert_eq!(&foo_init, foo_module.file().path(&db));
assert_eq!(ModuleKind::Package, foo_module.kind());
assert_eq!(
Some(foo_module),
path_to_module(&db, &VfsPath::FileSystem(foo_init))
);
assert_eq!(None, path_to_module(&db, &VfsPath::FileSystem(foo_py)));
Ok(())
}
#[test]
fn typing_stub_over_module() -> anyhow::Result<()> {
let TestCase { db, src, .. } = create_resolver()?;
let foo_stub = src.join("foo.pyi");
let foo_py = src.join("foo.py");
db.memory_file_system()
.write_files([(&foo_stub, "x: int"), (&foo_py, "print('Hello, world!')")])?;
let foo = resolve_module(&db, ModuleName::new_static("foo").unwrap()).unwrap();
assert_eq!(&src, foo.search_path().path());
assert_eq!(&foo_stub, foo.file().path(&db));
assert_eq!(
Some(foo),
path_to_module(&db, &VfsPath::FileSystem(foo_stub))
);
assert_eq!(None, path_to_module(&db, &VfsPath::FileSystem(foo_py)));
Ok(())
}
#[test]
fn sub_packages() -> anyhow::Result<()> {
let TestCase { db, src, .. } = create_resolver()?;
let foo = src.join("foo");
let bar = foo.join("bar");
let baz = bar.join("baz.py");
db.memory_file_system().write_files([
(&foo.join("__init__.py"), ""),
(&bar.join("__init__.py"), ""),
(&baz, "print('Hello, world!')"),
])?;
let baz_module =
resolve_module(&db, ModuleName::new_static("foo.bar.baz").unwrap()).unwrap();
assert_eq!(&src, baz_module.search_path().path());
assert_eq!(&baz, baz_module.file().path(&db));
assert_eq!(
Some(baz_module),
path_to_module(&db, &VfsPath::FileSystem(baz))
);
Ok(())
}
#[test]
fn namespace_package() -> anyhow::Result<()> {
let TestCase {
db,
src,
site_packages,
..
} = create_resolver()?;
// From [PEP420](https://peps.python.org/pep-0420/#nested-namespace-packages).
// But uses `src` for `project1` and `site_packages` for `project2`.
// ```
// src
// parent
// child
// one.py
// site_packages
// parent
// child
// two.py
// ```
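// Neither `parent` nor `child` contains an `__init__.py`, so both are
// treated as namespace packages and module resolution keeps searching the
// remaining search paths; `one` and `two` therefore both resolve.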
let parent1 = src.join("parent");
let child1 = parent1.join("child");
let one = child1.join("one.py");
let parent2 = site_packages.join("parent");
let child2 = parent2.join("child");
let two = child2.join("two.py");
db.memory_file_system().write_files([
(&one, "print('Hello, world!')"),
(&two, "print('Hello, world!')"),
])?;
let one_module =
resolve_module(&db, ModuleName::new_static("parent.child.one").unwrap()).unwrap();
assert_eq!(
Some(one_module),
path_to_module(&db, &VfsPath::FileSystem(one))
);
let two_module =
resolve_module(&db, ModuleName::new_static("parent.child.two").unwrap()).unwrap();
assert_eq!(
Some(two_module),
path_to_module(&db, &VfsPath::FileSystem(two))
);
Ok(())
}
#[test]
fn regular_package_in_namespace_package() -> anyhow::Result<()> {
let TestCase {
db,
src,
site_packages,
..
} = create_resolver()?;
// Adapted test case from the [PEP420 examples](https://peps.python.org/pep-0420/#nested-namespace-packages).
// The `src/parent/child` package is a regular package. Therefore, `site_packages/parent/child/two.py` should not be resolved.
// ```
// src
// parent
// child
// one.py
// site_packages
// parent
// child
// two.py
// ```
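// `src/parent/child/__init__.py` exists, so `parent.child` is a regular
// package: resolution stops at the first match and never considers
// `site_packages/parent/child/two.py`.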
let parent1 = src.join("parent");
let child1 = parent1.join("child");
let one = child1.join("one.py");
let parent2 = site_packages.join("parent");
let child2 = parent2.join("child");
let two = child2.join("two.py");
db.memory_file_system().write_files([
(&child1.join("__init__.py"), "print('Hello, world!')"),
(&one, "print('Hello, world!')"),
(&two, "print('Hello, world!')"),
])?;
let one_module =
resolve_module(&db, ModuleName::new_static("parent.child.one").unwrap()).unwrap();
assert_eq!(
Some(one_module),
path_to_module(&db, &VfsPath::FileSystem(one))
);
assert_eq!(
None,
resolve_module(&db, ModuleName::new_static("parent.child.two").unwrap())
);
Ok(())
}
#[test]
fn module_search_path_priority() -> anyhow::Result<()> {
let TestCase {
db,
src,
site_packages,
..
} = create_resolver()?;
let foo_src = src.join("foo.py");
let foo_site_packages = site_packages.join("foo.py");
db.memory_file_system()
.write_files([(&foo_src, ""), (&foo_site_packages, "")])?;
let foo_module = resolve_module(&db, ModuleName::new_static("foo").unwrap()).unwrap();
assert_eq!(&src, foo_module.search_path().path());
assert_eq!(&foo_src, foo_module.file().path(&db));
assert_eq!(
Some(foo_module),
path_to_module(&db, &VfsPath::FileSystem(foo_src))
);
assert_eq!(
None,
path_to_module(&db, &VfsPath::FileSystem(foo_site_packages))
);
Ok(())
}
#[test]
#[cfg(target_family = "unix")]
fn symlink() -> anyhow::Result<()> {
let TestCase {
mut db,
src,
site_packages,
custom_typeshed,
} = create_resolver()?;
db.with_os_file_system();
let temp_dir = tempfile::tempdir()?;
let root = FileSystemPath::from_std_path(temp_dir.path()).unwrap();
let src = root.join(src);
let site_packages = root.join(site_packages);
let custom_typeshed = root.join(custom_typeshed);
let foo = src.join("foo.py");
let bar = src.join("bar.py");
std::fs::create_dir_all(src.as_std_path())?;
std::fs::create_dir_all(site_packages.as_std_path())?;
std::fs::create_dir_all(custom_typeshed.as_std_path())?;
std::fs::write(foo.as_std_path(), "")?;
std::os::unix::fs::symlink(foo.as_std_path(), bar.as_std_path())?;
let settings = ModuleResolutionSettings {
extra_paths: vec![],
workspace_root: src.clone(),
site_packages: Some(site_packages),
custom_typeshed: Some(custom_typeshed),
};
set_module_resolution_settings(&mut db, settings);
let foo_module = resolve_module(&db, ModuleName::new_static("foo").unwrap()).unwrap();
let bar_module = resolve_module(&db, ModuleName::new_static("bar").unwrap()).unwrap();
// `foo` and `bar` shouldn't resolve to the same file, even though `bar` is a symlink to `foo`.
assert_ne!(foo_module, bar_module);
assert_eq!(&src, foo_module.search_path().path());
assert_eq!(&foo, foo_module.file().path(&db));
assert_eq!(&src, bar_module.search_path().path());
assert_eq!(&bar, bar_module.file().path(&db));
assert_eq!(
Some(foo_module),
path_to_module(&db, &VfsPath::FileSystem(foo))
);
assert_eq!(
Some(bar_module),
path_to_module(&db, &VfsPath::FileSystem(bar))
);
Ok(())
}
#[test]
fn deleting_an_unrelated_file_doesnt_change_module_resolution() -> anyhow::Result<()> {
let TestCase { mut db, src, .. } = create_resolver()?;
let foo_path = src.join("foo.py");
let bar_path = src.join("bar.py");
db.memory_file_system()
.write_files([(&foo_path, "x = 1"), (&bar_path, "y = 2")])?;
let foo_module_name = ModuleName::new_static("foo").unwrap();
let foo_module = resolve_module(&db, foo_module_name.clone()).unwrap();
let bar = system_path_to_file(&db, &bar_path).expect("bar.py to exist");
db.clear_salsa_events();
// Delete `bar.py`
db.memory_file_system().remove_file(&bar_path)?;
bar.touch(&mut db);
// Re-query the foo module. The foo module should still be cached because `bar.py` isn't relevant
// for resolving `foo`.
let foo_module2 = resolve_module(&db, foo_module_name);
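// If salsa had re-executed any query, a `WillExecute` event would appear
// in the event log; its absence shows the cached result was reused.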
assert!(!db
.take_salsa_events()
.iter()
.any(|event| { matches!(event.kind, salsa::EventKind::WillExecute { .. }) }));
assert_eq!(Some(foo_module), foo_module2);
Ok(())
}
#[test]
fn adding_a_file_that_the_module_resolution_depends_on_invalidates_the_query(
) -> anyhow::Result<()> {
let TestCase { mut db, src, .. } = create_resolver()?;
let foo_path = src.join("foo.py");
let foo_module_name = ModuleName::new_static("foo").unwrap();
assert_eq!(resolve_module(&db, foo_module_name.clone()), None);
// Now write the foo file
db.memory_file_system().write_file(&foo_path, "x = 1")?;
VfsFile::touch_path(&mut db, &VfsPath::FileSystem(foo_path.clone()));
let foo_file = system_path_to_file(&db, &foo_path).expect("foo.py to exist");
let foo_module = resolve_module(&db, foo_module_name).expect("Foo module to resolve");
assert_eq!(foo_file, foo_module.file());
Ok(())
}
#[test]
fn removing_a_file_that_the_module_resolution_depends_on_invalidates_the_query(
) -> anyhow::Result<()> {
let TestCase { mut db, src, .. } = create_resolver()?;
let foo_path = src.join("foo.py");
let foo_init_path = src.join("foo/__init__.py");
db.memory_file_system()
.write_files([(&foo_path, "x = 1"), (&foo_init_path, "x = 2")])?;
let foo_module_name = ModuleName::new_static("foo").unwrap();
let foo_module = resolve_module(&db, foo_module_name.clone()).expect("foo module to exist");
assert_eq!(&foo_init_path, foo_module.file().path(&db));
// Delete `foo/__init__.py` and the `foo` folder. `foo` should now resolve to `foo.py`
db.memory_file_system().remove_file(&foo_init_path)?;
db.memory_file_system()
.remove_directory(foo_init_path.parent().unwrap())?;
VfsFile::touch_path(&mut db, &VfsPath::FileSystem(foo_init_path.clone()));
let foo_module = resolve_module(&db, foo_module_name).expect("Foo module to resolve");
assert_eq!(&foo_path, foo_module.file().path(&db));
Ok(())
}
}

@@ -0,0 +1,91 @@
pub(crate) mod versions;
#[cfg(test)]
mod tests {
use std::io::{self, Read};
use std::path::Path;
use ruff_db::vendored::VendoredFileSystem;
use ruff_db::vfs::VendoredPath;
// The file path here is hardcoded in this crate's `build.rs` script.
// Luckily this crate will fail to build if this file isn't available at build time.
const TYPESHED_ZIP_BYTES: &[u8] =
include_bytes!(concat!(env!("OUT_DIR"), "/zipped_typeshed.zip"));
#[test]
fn typeshed_zip_created_at_build_time() {
let mut typeshed_zip_archive =
zip::ZipArchive::new(io::Cursor::new(TYPESHED_ZIP_BYTES)).unwrap();
let mut functools_module_stub = typeshed_zip_archive
.by_name("stdlib/functools.pyi")
.unwrap();
assert!(functools_module_stub.is_file());
let mut functools_module_stub_source = String::new();
functools_module_stub
.read_to_string(&mut functools_module_stub_source)
.unwrap();
assert!(functools_module_stub_source.contains("def update_wrapper("));
}
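// Checks that the checked-in `vendor/typeshed` directory and the zipped
// `VendoredFileSystem` stay in sync: every file and directory on disk must
// exist in the zip with a matching file/directory kind.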
#[test]
fn typeshed_vfs_consistent_with_vendored_stubs() {
let vendored_typeshed_dir = Path::new("vendor/typeshed").canonicalize().unwrap();
let vendored_typeshed_stubs = VendoredFileSystem::new(TYPESHED_ZIP_BYTES).unwrap();
let mut empty_iterator = true;
for entry in walkdir::WalkDir::new(&vendored_typeshed_dir).min_depth(1) {
empty_iterator = false;
let entry = entry.unwrap();
let absolute_path = entry.path();
let file_type = entry.file_type();
let relative_path = absolute_path
.strip_prefix(&vendored_typeshed_dir)
.unwrap_or_else(|_| {
panic!("Expected {absolute_path:?} to be a child of {vendored_typeshed_dir:?}")
});
let vendored_path = <&VendoredPath>::try_from(relative_path)
.unwrap_or_else(|_| panic!("Expected {relative_path:?} to be valid UTF-8"));
assert!(
vendored_typeshed_stubs.exists(vendored_path),
"Expected {vendored_path:?} to exist in the `VendoredFileSystem`!
Vendored file system:
{vendored_typeshed_stubs:#?}
"
);
let vendored_path_kind = vendored_typeshed_stubs
.metadata(vendored_path)
.unwrap_or_else(|| {
panic!(
"Expected metadata for {vendored_path:?} to be retrievable from the `VendoredFileSystem!
Vendored file system:
{vendored_typeshed_stubs:#?}
"
)
})
.kind();
assert_eq!(
vendored_path_kind.is_directory(),
file_type.is_dir(),
"{vendored_path:?} had type {vendored_path_kind:?}, inconsistent with fs path {relative_path:?}: {file_type:?}"
);
}
assert!(
!empty_iterator,
"Expected there to be at least one file or directory in the vendored typeshed stubs!"
);
}
}

@@ -0,0 +1,590 @@
use std::collections::BTreeMap;
use std::fmt;
use std::num::{NonZeroU16, NonZeroUsize};
use std::ops::{RangeFrom, RangeInclusive};
use std::str::FromStr;
use rustc_hash::FxHashMap;
use crate::module::ModuleName;
#[derive(Debug, PartialEq, Eq)]
pub struct TypeshedVersionsParseError {
line_number: NonZeroU16,
reason: TypeshedVersionsParseErrorKind,
}
impl fmt::Display for TypeshedVersionsParseError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let TypeshedVersionsParseError {
line_number,
reason,
} = self;
write!(
f,
"Error while parsing line {line_number} of typeshed's VERSIONS file: {reason}"
)
}
}
impl std::error::Error for TypeshedVersionsParseError {
fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
if let TypeshedVersionsParseErrorKind::IntegerParsingFailure { err, .. } = &self.reason {
Some(err)
} else {
None
}
}
}
#[derive(Debug, PartialEq, Eq)]
pub enum TypeshedVersionsParseErrorKind {
TooManyLines(NonZeroUsize),
UnexpectedNumberOfColons,
InvalidModuleName(String),
UnexpectedNumberOfHyphens,
UnexpectedNumberOfPeriods(String),
IntegerParsingFailure {
version: String,
err: std::num::ParseIntError,
},
}
impl fmt::Display for TypeshedVersionsParseErrorKind {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::TooManyLines(num_lines) => write!(
f,
"File has too many lines ({num_lines}); maximum allowed is {}",
NonZeroU16::MAX
),
Self::UnexpectedNumberOfColons => {
f.write_str("Expected every non-comment line to have exactly one colon")
}
Self::InvalidModuleName(name) => write!(
f,
"Expected all components of '{name}' to be valid Python identifiers"
),
Self::UnexpectedNumberOfHyphens => {
f.write_str("Expected every non-comment line to have exactly one '-' character")
}
Self::UnexpectedNumberOfPeriods(format) => write!(
f,
"Expected all versions to be in the form {{MAJOR}}.{{MINOR}}; got '{format}'"
),
Self::IntegerParsingFailure { version, err } => write!(
f,
"Failed to convert '{version}' to a pair of integers due to {err}",
),
}
}
}
#[derive(Debug, PartialEq, Eq)]
pub struct TypeshedVersions(FxHashMap<ModuleName, PyVersionRange>);
impl TypeshedVersions {
pub fn len(&self) -> usize {
self.0.len()
}
pub fn is_empty(&self) -> bool {
self.0.is_empty()
}
pub fn contains_module(&self, module_name: &ModuleName) -> bool {
self.0.contains_key(module_name)
}
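// Note: `module_exists_on_version` below walks up the module hierarchy. If
// `foo.bar.baz` has no explicit entry, its parent `foo.bar` is tried next,
// then `foo`, mirroring typeshed's convention that a submodule is only
// listed separately when its availability differs from its parent's.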
pub fn module_exists_on_version(
&self,
module: ModuleName,
version: impl Into<PyVersion>,
) -> bool {
let version = version.into();
let mut module: Option<ModuleName> = Some(module);
while let Some(module_to_try) = module {
if let Some(range) = self.0.get(&module_to_try) {
return range.contains(version);
}
module = module_to_try.parent();
}
false
}
}
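// Parses the contents of typeshed's `stdlib/VERSIONS` file. Each
// non-comment line has the form `module: MAJOR.MINOR-` (available from
// that version onwards) or `module: MAJOR.MINOR-MAJOR.MINOR` (available
// within that inclusive range), e.g. `asynchat: 3.0-3.11`. Blank lines
// and `#` comments are skipped.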
impl FromStr for TypeshedVersions {
type Err = TypeshedVersionsParseError;
fn from_str(s: &str) -> Result<Self, Self::Err> {
let mut map = FxHashMap::default();
for (line_index, line) in s.lines().enumerate() {
// humans expect line numbers to be 1-indexed
let line_number = NonZeroUsize::new(line_index.saturating_add(1)).unwrap();
let Ok(line_number) = NonZeroU16::try_from(line_number) else {
return Err(TypeshedVersionsParseError {
line_number: NonZeroU16::MAX,
reason: TypeshedVersionsParseErrorKind::TooManyLines(line_number),
});
};
let Some(content) = line.split('#').map(str::trim).next() else {
continue;
};
if content.is_empty() {
continue;
}
let mut parts = content.split(':').map(str::trim);
let (Some(module_name), Some(rest), None) = (parts.next(), parts.next(), parts.next())
else {
return Err(TypeshedVersionsParseError {
line_number,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfColons,
});
};
let Some(module_name) = ModuleName::new(module_name) else {
return Err(TypeshedVersionsParseError {
line_number,
reason: TypeshedVersionsParseErrorKind::InvalidModuleName(
module_name.to_string(),
),
});
};
match PyVersionRange::from_str(rest) {
Ok(version) => map.insert(module_name, version),
Err(reason) => {
return Err(TypeshedVersionsParseError {
line_number,
reason,
})
}
};
}
Ok(Self(map))
}
}
impl fmt::Display for TypeshedVersions {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let sorted_items: BTreeMap<&ModuleName, &PyVersionRange> = self.0.iter().collect();
for (module_name, range) in sorted_items {
writeln!(f, "{module_name}: {range}")?;
}
Ok(())
}
}
#[derive(Debug, Clone, Eq, PartialEq)]
enum PyVersionRange {
AvailableFrom(RangeFrom<PyVersion>),
AvailableWithin(RangeInclusive<PyVersion>),
}
impl PyVersionRange {
fn contains(&self, version: PyVersion) -> bool {
match self {
Self::AvailableFrom(inner) => inner.contains(&version),
Self::AvailableWithin(inner) => inner.contains(&version),
}
}
}
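// For example, `"3.8-"` parses to `AvailableFrom(3.8..)` and `"2.7-3.10"`
// to `AvailableWithin(2.7..=3.10)`; anything with no `-`, or more than
// one, is rejected with `UnexpectedNumberOfHyphens`.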
impl FromStr for PyVersionRange {
type Err = TypeshedVersionsParseErrorKind;
fn from_str(s: &str) -> Result<Self, Self::Err> {
let mut parts = s.split('-').map(str::trim);
match (parts.next(), parts.next(), parts.next()) {
(Some(lower), Some(""), None) => Ok(Self::AvailableFrom((lower.parse()?)..)),
(Some(lower), Some(upper), None) => {
Ok(Self::AvailableWithin((lower.parse()?)..=(upper.parse()?)))
}
_ => Err(TypeshedVersionsParseErrorKind::UnexpectedNumberOfHyphens),
}
}
}
impl fmt::Display for PyVersionRange {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::AvailableFrom(range_from) => write!(f, "{}-", range_from.start),
Self::AvailableWithin(range_inclusive) => {
write!(f, "{}-{}", range_inclusive.start(), range_inclusive.end())
}
}
}
}
#[derive(Debug, Clone, Copy, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub struct PyVersion {
major: u8,
minor: u8,
}
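// Parses a `MAJOR.MINOR` string such as `"3.8"`. Strings with any other
// number of `.` separators (e.g. `"38"` or `"3..8"`) are rejected before
// any integer conversion is attempted.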
impl FromStr for PyVersion {
type Err = TypeshedVersionsParseErrorKind;
fn from_str(s: &str) -> Result<Self, Self::Err> {
let mut parts = s.split('.').map(str::trim);
let (Some(major), Some(minor), None) = (parts.next(), parts.next(), parts.next()) else {
return Err(TypeshedVersionsParseErrorKind::UnexpectedNumberOfPeriods(
s.to_string(),
));
};
let major = match u8::from_str(major) {
Ok(major) => major,
Err(err) => {
return Err(TypeshedVersionsParseErrorKind::IntegerParsingFailure {
version: s.to_string(),
err,
})
}
};
let minor = match u8::from_str(minor) {
Ok(minor) => minor,
Err(err) => {
return Err(TypeshedVersionsParseErrorKind::IntegerParsingFailure {
version: s.to_string(),
err,
})
}
};
Ok(Self { major, minor })
}
}
impl fmt::Display for PyVersion {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let PyVersion { major, minor } = self;
write!(f, "{major}.{minor}")
}
}
// TODO: unify with the PythonVersion enum in the linter/formatter crates?
#[derive(Copy, Clone, Hash, Debug, PartialEq, Eq, PartialOrd, Ord, Default)]
pub enum SupportedPyVersion {
Py37,
#[default]
Py38,
Py39,
Py310,
Py311,
Py312,
Py313,
}
impl From<SupportedPyVersion> for PyVersion {
fn from(value: SupportedPyVersion) -> Self {
match value {
SupportedPyVersion::Py37 => PyVersion { major: 3, minor: 7 },
SupportedPyVersion::Py38 => PyVersion { major: 3, minor: 8 },
SupportedPyVersion::Py39 => PyVersion { major: 3, minor: 9 },
SupportedPyVersion::Py310 => PyVersion {
major: 3,
minor: 10,
},
SupportedPyVersion::Py311 => PyVersion {
major: 3,
minor: 11,
},
SupportedPyVersion::Py312 => PyVersion {
major: 3,
minor: 12,
},
SupportedPyVersion::Py313 => PyVersion {
major: 3,
minor: 13,
},
}
}
}
#[cfg(test)]
mod tests {
use std::num::{IntErrorKind, NonZeroU16};
use std::path::Path;
use super::*;
use insta::assert_snapshot;
const TYPESHED_STDLIB_DIR: &str = "stdlib";
#[allow(unsafe_code)]
const ONE: NonZeroU16 = unsafe { NonZeroU16::new_unchecked(1) };
#[test]
fn can_parse_vendored_versions_file() {
let versions_data = include_str!(concat!(
env!("CARGO_MANIFEST_DIR"),
"/vendor/typeshed/stdlib/VERSIONS"
));
let versions = TypeshedVersions::from_str(versions_data).unwrap();
assert!(versions.len() > 100);
assert!(versions.len() < 1000);
let asyncio = ModuleName::new_static("asyncio").unwrap();
let asyncio_staggered = ModuleName::new_static("asyncio.staggered").unwrap();
let audioop = ModuleName::new_static("audioop").unwrap();
assert!(versions.contains_module(&asyncio));
assert!(versions.module_exists_on_version(asyncio, SupportedPyVersion::Py310));
assert!(versions.contains_module(&asyncio_staggered));
assert!(
versions.module_exists_on_version(asyncio_staggered.clone(), SupportedPyVersion::Py38)
);
assert!(!versions.module_exists_on_version(asyncio_staggered, SupportedPyVersion::Py37));
assert!(versions.contains_module(&audioop));
assert!(versions.module_exists_on_version(audioop.clone(), SupportedPyVersion::Py312));
assert!(!versions.module_exists_on_version(audioop, SupportedPyVersion::Py313));
}
#[test]
fn typeshed_versions_consistent_with_vendored_stubs() {
const VERSIONS_DATA: &str = include_str!("../../vendor/typeshed/stdlib/VERSIONS");
let vendored_typeshed_dir = Path::new("vendor/typeshed").canonicalize().unwrap();
let vendored_typeshed_versions = TypeshedVersions::from_str(VERSIONS_DATA).unwrap();
let mut empty_iterator = true;
let stdlib_stubs_path = vendored_typeshed_dir.join(TYPESHED_STDLIB_DIR);
for entry in std::fs::read_dir(&stdlib_stubs_path).unwrap() {
empty_iterator = false;
let entry = entry.unwrap();
let absolute_path = entry.path();
let relative_path = absolute_path
.strip_prefix(&stdlib_stubs_path)
.unwrap_or_else(|_| panic!("Expected path to be a child of {stdlib_stubs_path:?} but found {absolute_path:?}"));
let relative_path_str = relative_path.as_os_str().to_str().unwrap_or_else(|| {
panic!("Expected all typeshed paths to be valid UTF-8; got {relative_path:?}")
});
if relative_path_str == "VERSIONS" {
continue;
}
let top_level_module = if let Some(extension) = relative_path.extension() {
// It was a file; strip off the file extension to get the module name:
let extension = extension
.to_str()
.unwrap_or_else(|| panic!("Expected all file extensions to be UTF-8; was not true for {relative_path:?}"));
relative_path_str
.strip_suffix(extension)
.and_then(|string| string.strip_suffix('.')).unwrap_or_else(|| {
panic!("Expected path {relative_path_str:?} to end with computed extension {extension:?}")
})
} else {
// It was a directory; no need to do anything to get the module name
relative_path_str
};
let top_level_module = ModuleName::new(top_level_module)
.unwrap_or_else(|| panic!("{top_level_module:?} was not a valid module name!"));
assert!(vendored_typeshed_versions.contains_module(&top_level_module));
}
assert!(
!empty_iterator,
"Expected there to be at least one file or directory in the vendored typeshed stubs"
);
}
#[test]
fn can_parse_mock_versions_file() {
const VERSIONS: &str = "\
# a comment
# some more comment
# yet more comment
# and some more comment
bar: 2.7-3.10
# more comment
bar.baz: 3.1-3.9
foo: 3.8- # trailing comment
";
let parsed_versions = TypeshedVersions::from_str(VERSIONS).unwrap();
assert_eq!(parsed_versions.len(), 3);
assert_snapshot!(parsed_versions.to_string(), @r###"
bar: 2.7-3.10
bar.baz: 3.1-3.9
foo: 3.8-
"###
);
let foo = ModuleName::new_static("foo").unwrap();
let bar = ModuleName::new_static("bar").unwrap();
let bar_baz = ModuleName::new_static("bar.baz").unwrap();
let spam = ModuleName::new_static("spam").unwrap();
assert!(parsed_versions.contains_module(&foo));
assert!(!parsed_versions.module_exists_on_version(foo.clone(), SupportedPyVersion::Py37));
assert!(parsed_versions.module_exists_on_version(foo.clone(), SupportedPyVersion::Py38));
assert!(parsed_versions.module_exists_on_version(foo, SupportedPyVersion::Py311));
assert!(parsed_versions.contains_module(&bar));
assert!(parsed_versions.module_exists_on_version(bar.clone(), SupportedPyVersion::Py37));
assert!(parsed_versions.module_exists_on_version(bar.clone(), SupportedPyVersion::Py310));
assert!(!parsed_versions.module_exists_on_version(bar, SupportedPyVersion::Py311));
assert!(parsed_versions.contains_module(&bar_baz));
assert!(parsed_versions.module_exists_on_version(bar_baz.clone(), SupportedPyVersion::Py37));
assert!(parsed_versions.module_exists_on_version(bar_baz.clone(), SupportedPyVersion::Py39));
assert!(!parsed_versions.module_exists_on_version(bar_baz, SupportedPyVersion::Py310));
assert!(!parsed_versions.contains_module(&spam));
assert!(!parsed_versions.module_exists_on_version(spam.clone(), SupportedPyVersion::Py37));
assert!(!parsed_versions.module_exists_on_version(spam, SupportedPyVersion::Py313));
}
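// A minimal illustrative sketch (not part of the change set above): a
// submodule without an explicit entry inherits its parent's version range.
#[test]
fn submodule_falls_back_to_parent_entry() {
    let versions = TypeshedVersions::from_str("bar: 2.7-3.10").unwrap();
    let bar_child = ModuleName::new_static("bar.child").unwrap();
    // `bar.child` is not listed, so the `bar` entry decides availability.
    assert!(versions.module_exists_on_version(bar_child.clone(), SupportedPyVersion::Py38));
    assert!(!versions.module_exists_on_version(bar_child, SupportedPyVersion::Py311));
}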
#[test]
fn invalid_huge_versions_file() {
let offset = 100;
let too_many = u16::MAX as usize + offset;
let mut massive_versions_file = String::new();
for i in 0..too_many {
massive_versions_file.push_str(&format!("x{i}: 3.8-\n"));
}
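// Parsing bails out at the first line whose 1-based number no longer fits
// in a `NonZeroU16`, i.e. line `u16::MAX as usize + 1`, which equals
// `too_many + 1 - offset` here.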
assert_eq!(
TypeshedVersions::from_str(&massive_versions_file),
Err(TypeshedVersionsParseError {
line_number: NonZeroU16::MAX,
reason: TypeshedVersionsParseErrorKind::TooManyLines(
NonZeroUsize::new(too_many + 1 - offset).unwrap()
)
})
);
}
#[test]
fn invalid_typeshed_versions_bad_colon_number() {
assert_eq!(
TypeshedVersions::from_str("foo 3.7"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfColons
})
);
assert_eq!(
TypeshedVersions::from_str("foo:: 3.7"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfColons
})
);
}
#[test]
fn invalid_typeshed_versions_non_identifier_modules() {
assert_eq!(
TypeshedVersions::from_str("not!an!identifier!: 3.7"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::InvalidModuleName(
"not!an!identifier!".to_string()
)
})
);
assert_eq!(
TypeshedVersions::from_str("(also_not).(an_identifier): 3.7"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::InvalidModuleName(
"(also_not).(an_identifier)".to_string()
)
})
);
}
#[test]
fn invalid_typeshed_versions_bad_hyphen_number() {
assert_eq!(
TypeshedVersions::from_str("foo: 3.8"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfHyphens
})
);
assert_eq!(
TypeshedVersions::from_str("foo: 3.8--"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfHyphens
})
);
assert_eq!(
TypeshedVersions::from_str("foo: 3.8--3.9"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfHyphens
})
);
}
#[test]
fn invalid_typeshed_versions_bad_period_number() {
assert_eq!(
TypeshedVersions::from_str("foo: 38-"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfPeriods("38".to_string())
})
);
assert_eq!(
TypeshedVersions::from_str("foo: 3..8-"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfPeriods(
"3..8".to_string()
)
})
);
assert_eq!(
TypeshedVersions::from_str("foo: 3.8-3..11"),
Err(TypeshedVersionsParseError {
line_number: ONE,
reason: TypeshedVersionsParseErrorKind::UnexpectedNumberOfPeriods(
"3..11".to_string()
)
})
);
}
#[test]
fn invalid_typeshed_versions_non_digits() {
let err = TypeshedVersions::from_str("foo: 1.two-").unwrap_err();
assert_eq!(err.line_number, ONE);
let TypeshedVersionsParseErrorKind::IntegerParsingFailure { version, err } = err.reason
else {
panic!()
};
assert_eq!(version, "1.two".to_string());
assert_eq!(*err.kind(), IntErrorKind::InvalidDigit);
let err = TypeshedVersions::from_str("foo: 3.8-four.9").unwrap_err();
assert_eq!(err.line_number, ONE);
let TypeshedVersionsParseErrorKind::IntegerParsingFailure { version, err } = err.reason
else {
panic!()
};
assert_eq!(version, "four.9".to_string());
assert_eq!(*err.kind(), IntErrorKind::InvalidDigit);
}
}

@@ -0,0 +1 @@
114409d49b43ba62a179ebb856fa70a5161f751e

@@ -65,9 +65,9 @@ array: 3.0-
 ast: 3.0-
 asynchat: 3.0-3.11
 asyncio: 3.4-
-asyncio.mixins: 3.10-
 asyncio.exceptions: 3.8-
 asyncio.format_helpers: 3.7-
+asyncio.mixins: 3.10-
 asyncio.runners: 3.7-
 asyncio.staggered: 3.8-
 asyncio.taskgroups: 3.11-
@@ -166,7 +166,7 @@ ipaddress: 3.3-
 itertools: 3.0-
 json: 3.0-
 keyword: 3.0-
-lib2to3: 3.0-
+lib2to3: 3.0-3.12
 linecache: 3.0-
 locale: 3.0-
 logging: 3.0-
@@ -270,6 +270,7 @@ threading: 3.0-
 time: 3.0-
 timeit: 3.0-
 tkinter: 3.0-
+tkinter.tix: 3.0-3.12
 token: 3.0-
 tokenize: 3.0-
 tomllib: 3.11-

File diff suppressed because it is too large

@@ -201,7 +201,7 @@ class Array(_CData, Generic[_CT]):
     # Sized and _CData prevents using _CDataMeta.
     def __len__(self) -> int: ...
     if sys.version_info >= (3, 9):
-        def __class_getitem__(cls, item: Any) -> GenericAlias: ...
+        def __class_getitem__(cls, item: Any, /) -> GenericAlias: ...
 
 def addressof(obj: _CData) -> int: ...
 def alignment(obj_or_type: _CData | type[_CData]) -> int: ...

@@ -63,8 +63,7 @@ A_COLOR: int
 A_DIM: int
 A_HORIZONTAL: int
 A_INVIS: int
-if sys.platform != "darwin":
-    A_ITALIC: int
+A_ITALIC: int
 A_LEFT: int
 A_LOW: int
 A_NORMAL: int

@@ -45,5 +45,5 @@ class make_scanner:
     def __init__(self, context: make_scanner) -> None: ...
     def __call__(self, string: str, index: int) -> tuple[Any, int]: ...
 
-def encode_basestring_ascii(s: str) -> str: ...
+def encode_basestring_ascii(s: str, /) -> str: ...
 def scanstring(string: str, end: int, strict: bool = ...) -> tuple[str, int]: ...

@@ -783,7 +783,7 @@ def ntohl(x: int, /) -> int: ... # param & ret val are 32-bit ints
 def ntohs(x: int, /) -> int: ... # param & ret val are 16-bit ints
 def htonl(x: int, /) -> int: ... # param & ret val are 32-bit ints
 def htons(x: int, /) -> int: ... # param & ret val are 16-bit ints
-def inet_aton(ip_string: str, /) -> bytes: ... # ret val 4 bytes in length
+def inet_aton(ip_addr: str, /) -> bytes: ... # ret val 4 bytes in length
 def inet_ntoa(packed_ip: ReadableBuffer, /) -> str: ...
 def inet_pton(address_family: int, ip_string: str, /) -> bytes: ...
 def inet_ntop(address_family: int, packed_ip: ReadableBuffer, /) -> str: ...
@@ -797,7 +797,7 @@ if sys.platform != "win32":
     def socketpair(family: int = ..., type: int = ..., proto: int = ..., /) -> tuple[socket, socket]: ...
 
 def if_nameindex() -> list[tuple[int, str]]: ...
-def if_nametoindex(name: str, /) -> int: ...
+def if_nametoindex(oname: str, /) -> int: ...
 def if_indextoname(index: int, /) -> str: ...
 
 CAPI: object

@@ -64,19 +64,19 @@ UF_NODUMP: Literal[0x00000001]
 UF_NOUNLINK: Literal[0x00000010]
 UF_OPAQUE: Literal[0x00000008]
 
-def S_IMODE(mode: int) -> int: ...
-def S_IFMT(mode: int) -> int: ...
-def S_ISBLK(mode: int) -> bool: ...
-def S_ISCHR(mode: int) -> bool: ...
-def S_ISDIR(mode: int) -> bool: ...
-def S_ISDOOR(mode: int) -> bool: ...
-def S_ISFIFO(mode: int) -> bool: ...
-def S_ISLNK(mode: int) -> bool: ...
-def S_ISPORT(mode: int) -> bool: ...
-def S_ISREG(mode: int) -> bool: ...
-def S_ISSOCK(mode: int) -> bool: ...
-def S_ISWHT(mode: int) -> bool: ...
-def filemode(mode: int) -> str: ...
+def S_IMODE(mode: int, /) -> int: ...
+def S_IFMT(mode: int, /) -> int: ...
+def S_ISBLK(mode: int, /) -> bool: ...
+def S_ISCHR(mode: int, /) -> bool: ...
+def S_ISDIR(mode: int, /) -> bool: ...
+def S_ISDOOR(mode: int, /) -> bool: ...
+def S_ISFIFO(mode: int, /) -> bool: ...
+def S_ISLNK(mode: int, /) -> bool: ...
+def S_ISPORT(mode: int, /) -> bool: ...
+def S_ISREG(mode: int, /) -> bool: ...
+def S_ISSOCK(mode: int, /) -> bool: ...
+def S_ISWHT(mode: int, /) -> bool: ...
+def filemode(mode: int, /) -> str: ...
 
 if sys.platform == "win32":
     IO_REPARSE_TAG_SYMLINK: int
@@ -101,3 +101,17 @@ if sys.platform == "win32":
     FILE_ATTRIBUTE_SYSTEM: Literal[4]
     FILE_ATTRIBUTE_TEMPORARY: Literal[256]
     FILE_ATTRIBUTE_VIRTUAL: Literal[65536]
+
+if sys.version_info >= (3, 13):
+    SF_SETTABLE: Literal[0x3FFF0000]
+    # https://github.com/python/cpython/issues/114081#issuecomment-2119017790
+    # SF_RESTRICTED: Literal[0x00080000]
+    SF_FIRMLINK: Literal[0x00800000]
+    SF_DATALESS: Literal[0x40000000]
+    SF_SUPPORTED: Literal[0x9F0000]
+    SF_SYNTHETIC: Literal[0xC0000000]
+    UF_TRACKED: Literal[0x00000040]
+    UF_DATAVAULT: Literal[0x00000080]
+    UF_SETTABLE: Literal[0x0000FFFF]

@@ -1,5 +1,7 @@
 import sys
+from collections.abc import Callable
 from typing import Any, ClassVar, Literal, final
+from typing_extensions import TypeAlias
 
 # _tkinter is meant to be only used internally by tkinter, but some tkinter
 # functions e.g. return _tkinter.Tcl_Obj objects. Tcl_Obj represents a Tcl
@@ -30,6 +32,8 @@ class Tcl_Obj:
 class TclError(Exception): ...
 
+_TkinterTraceFunc: TypeAlias = Callable[[tuple[str, ...]], object]
+
 # This class allows running Tcl code. Tkinter uses it internally a lot, and
 # it's often handy to drop a piece of Tcl code into a tkinter program. Example:
 #
@@ -86,6 +90,9 @@ class TkappType:
     def unsetvar(self, *args, **kwargs): ...
     def wantobjects(self, *args, **kwargs): ...
     def willdispatch(self): ...
+    if sys.version_info >= (3, 12):
+        def gettrace(self, /) -> _TkinterTraceFunc | None: ...
+        def settrace(self, func: _TkinterTraceFunc | None, /) -> None: ...
 
 # These should be kept in sync with tkinter.tix constants, except ALL_EVENTS which doesn't match TCL_ALL_EVENTS
 ALL_EVENTS: Literal[-3]

@@ -326,6 +326,8 @@ class structseq(Generic[_T_co]):
     # but only has any meaning if you supply it a dict where the keys are strings.
     # https://github.com/python/typeshed/pull/6560#discussion_r767149830
     def __new__(cls: type[Self], sequence: Iterable[_T_co], dict: dict[str, Any] = ...) -> Self: ...
+    if sys.version_info >= (3, 13):
+        def __replace__(self: Self, **kwargs: Any) -> Self: ...
 
 # Superset of typing.AnyStr that also includes LiteralString
 AnyOrLiteralStr = TypeVar("AnyOrLiteralStr", str, bytes, LiteralString) # noqa: Y001

Some files were not shown because too many files have changed in this diff