Compare commits

47 Commits

Author SHA1 Message Date
Douglas Creager
a59fae85cc here 2025-12-03 16:38:04 -05:00
Douglas Creager
705e4725ad generic_context should work for callables too 2025-12-03 16:38:04 -05:00
Douglas Creager
3d73506e05 make PartialSpec an enum 2025-12-03 16:38:04 -05:00
Douglas Creager
af67d7307a debug 2025-12-03 16:38:04 -05:00
Douglas Creager
1e33d25d1c fix test 2025-12-03 16:38:01 -05:00
Douglas Creager
b90cdfc2f7 generic 2025-12-03 16:36:21 -05:00
Douglas Creager
94aca37ca8 skip non-inferable 2025-12-03 16:30:44 -05:00
Douglas Creager
75e9d66d4b self 2025-12-03 12:37:04 -05:00
Douglas Creager
3bcca62472 doc 2025-12-03 12:12:00 -05:00
Douglas Creager
85e6143e07 use self annotation in synthesized __init__ callable 2025-12-03 12:09:04 -05:00
Douglas Creager
77ce24a5bf allow multiple overloads/callables when inferring 2025-12-03 12:04:59 -05:00
Douglas Creager
db5834dfd7 add failing tests 2025-12-03 12:04:00 -05:00
Douglas Creager
2e46c8de06 Merge remote-tracking branch 'origin/main' into dcreager/callable-return
* origin/main:
  [ty] Reachability constraints: minor documentation fixes (#21774)
  [ty] Fix non-determinism in `ConstraintSet.specialize_constrained` (#21744)
  [ty] Improve `@override`, `@final` and Liskov checks in cases where there are multiple reachable definitions (#21767)
  [ty] Extend `invalid-explicit-override` to also cover properties decorated with `@override` that do not override anything (#21756)
  [ty] Enable LRU collection for parsed module (#21749)
  [ty] Support typevar-specialized dynamic types in generic type aliases (#21730)
  Add token based `parenthesized_ranges` implementation (#21738)
  [ty] Default-specialization of generic type aliases (#21765)
  [ty] Suppress false positives when `dataclasses.dataclass(...)(cls)` is called imperatively (#21729)
  [syntax-error] Default type parameter followed by non-default type parameter (#21657)
2025-12-03 10:48:36 -05:00
David Peter
d6e472f297 [ty] Reachability constraints: minor documentation fixes (#21774) 2025-12-03 16:40:11 +01:00
Douglas Creager
45842cc034 [ty] Fix non-determinism in ConstraintSet.specialize_constrained (#21744)
This fixes a non-determinism that we were seeing in the constraint set
tests in https://github.com/astral-sh/ruff/pull/21715.

In this test, we create the following constraint set, and then try to
create a specialization from it:

```
(T@constrained_by_gradual_list = list[Base])
  ∨
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
```

That is, `T` is either specifically `list[Base]`, or it's any `list`.
Our current heuristics say that, absent other restrictions, we should
specialize `T` to the more specific type (`list[Base]`).

In the correct test output, we end up creating a BDD that looks like
this:

```
(T@constrained_by_gradual_list = list[Base])
┡━₁ always
└─₀ (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
    ┡━₁ always
    └─₀ never
```

In the incorrect output, the BDD looks like this:

```
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ never
```

The difference is the ordering of the two individual constraints. Both
constraints appear in the first BDD, but the second BDD only contains `T
is any list`. If we were to force the second BDD to contain both
constraints, it would look like this:

```
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ (T@constrained_by_gradual_list = list[Base])
    ┡━₁ always
    └─₀ never
```

This is the standard shape for an OR of two constraints. However! Those
two constraints are not independent of each other! If `T` is
specifically `list[Base]`, then it's definitely also "any `list`". From
that, we can infer the contrapositive: that if `T` is not any list, then
it cannot be `list[Base]` specifically. When we encounter impossible
situations like that, we prune that path in the BDD, and treat it as
`false`. That rewrites the second BDD to the following:

```
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ (T@constrained_by_gradual_list = list[Base])
    ┡━₁ never   <-- IMPOSSIBLE, rewritten to never
    └─₀ never
```

We then would see that that BDD node is redundant, since both of its
outgoing edges point at the `never` node. Our BDDs are _reduced_, which
means we have to remove that redundant node, resulting in the BDD we saw
above:

```
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ never       <-- redundant node removed
```

The end result is that we were "forgetting" about the `T = list[Base]`
constraint, but only for some BDD variable orderings.

To fix this, I'm leaning in to the fact that our BDDs really do need to
"remember" all of the constraints that they were created with. Some
combinations might not be possible, but we now have the sequent map,
which is quite good at detecting and pruning those.

So now our BDDs are _quasi-reduced_, which just means that redundant
nodes are allowed. (At first I was worried that allowing redundant nodes
would be an unsound "fix the glitch". But it turns out they're real!
[This](https://ieeexplore.ieee.org/abstract/document/130209) is the
paper that introduces them, though it's very difficult to read. Knuth
mentions them in §7.1.4 of
[TAOCP](https://course.khoury.northeastern.edu/csu690/ssl/bdd-knuth.pdf),
and [this paper](https://par.nsf.gov/servlets/purl/10128966) has a nice
short summary of them in §2.)
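
As a rough illustration of the difference between a reduced and a
quasi-reduced BDD, here is a toy Python sketch. It is not ty's Rust
implementation: the `Node` class and the `simplify` and `constraints` helpers
are hypothetical names invented for this example. It starts from the BDD
above in which the impossible path has already been rewritten to `never`
(`False` here), then shows that removing redundant nodes drops the
`T = list[Base]` constraint, while keeping them preserves it.

```python
from dataclasses import dataclass

ANY_LIST = "T is any list"
LIST_BASE = "T = list[Base]"


@dataclass(frozen=True)
class Node:
    """Toy BDD node; terminals are the plain booleans True/False."""
    constraint: str
    if_true: object   # Node or bool
    if_false: object  # Node or bool


def simplify(bdd, reduce):
    """Rebuild the BDD bottom-up, dropping redundant nodes only if `reduce`."""
    if isinstance(bdd, bool):
        return bdd
    hi = simplify(bdd.if_true, reduce)
    lo = simplify(bdd.if_false, reduce)
    if reduce and hi == lo:
        # Both outgoing edges agree, so a reduced BDD removes this node.
        return hi
    return Node(bdd.constraint, hi, lo)


def constraints(bdd):
    """Collect every constraint still mentioned somewhere in the BDD."""
    if isinstance(bdd, bool):
        return set()
    return {bdd.constraint} | constraints(bdd.if_true) | constraints(bdd.if_false)


# OR of the two constraints with ANY_LIST ordered first, after the impossible
# path (not any list, yet exactly list[Base]) was rewritten to never.
pruned = Node(ANY_LIST, True, Node(LIST_BASE, False, False))

reduced = simplify(pruned, reduce=True)
quasi_reduced = simplify(pruned, reduce=False)

print(sorted(constraints(reduced)))
print(sorted(constraints(quasi_reduced)))
```

With this ordering, the reduced form only mentions the "any list" constraint,
while the quasi-reduced form still mentions both.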

While we're here, I've added a bunch of `debug` and `trace` level log
messages to the constraint set implementation. I was getting tired of
having to add these by hand over and over. To enable them, just set
`TY_LOG` in your environment, e.g.

```sh
env TY_LOG=ty_python_semantic::types::constraints::SequentMap=trace ty check ...
```

[Note, this has an `internal` label because we are still not using
`specialize_constrained` in anything user-facing yet.]
2025-12-03 10:19:39 -05:00
Alex Waygood
cd079bd92e [ty] Improve @override, @final and Liskov checks in cases where there are multiple reachable definitions (#21767) 2025-12-03 12:51:36 +00:00
Alex Waygood
5756b3809c [ty] Extend invalid-explicit-override to also cover properties decorated with @override that do not override anything (#21756) 2025-12-03 11:27:47 +00:00
Micha Reiser
92c5f62ec0 [ty] Enable LRU collection for parsed module (#21749) 2025-12-03 12:16:18 +01:00
David Peter
21e5a57296 [ty] Support typevar-specialized dynamic types in generic type aliases (#21730)
## Summary

For a type alias like the one below, where `UnknownClass` is something
with a dynamic type, we previously lost track of the fact that this
dynamic type was explicitly specialized *with a type variable*. If that
alias is then later explicitly specialized itself (`MyAlias[int]`), we
would miscount the number of legacy type variables and emit an
`invalid-type-arguments` diagnostic
([playground](https://play.ty.dev/886ae6cc-86c3-4304-a365-510d29211f85)).
```py
T = TypeVar("T")

MyAlias: TypeAlias = UnknownClass[T] | None
```
The solution implemented here is not pretty, but we can hopefully get
rid of it via https://github.com/astral-sh/ty/issues/1711. Also, once we
properly support `ParamSpec` and `Concatenate`, we should be able to
remove some of this code.

This addresses many of the `invalid-type-arguments` false-positives in
https://github.com/astral-sh/ty/issues/1685. With this change, there are
still some diagnostics of this type left. Instead of implementing even
more (rather sophisticated) workarounds for these cases as well, it
might be much easier to wait for full `ParamSpec`/`Concatenate` support
and then try again.

A disadvantage of this implementation is that we lose track of some
`@Todo` types and replace them with `Unknown`. We could spend more
effort and try to preserve them, but I'm unsure if this is the best use
of our time right now.
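
To make the expected behavior concrete, here is a small runtime sketch of the
same alias shape. Since `UnknownClass` is unresolvable by definition,
`typing.List` is used as a stand-in (an assumption made only so the example
runs), and `Optional[...]` replaces `... | None` for compatibility with older
interpreters. The point is that the alias carries exactly one type variable,
so specializing it with one argument should be accepted.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")

# Stand-in for a class whose type a checker cannot resolve (assumption).
UnknownClass = List

# Same shape as `MyAlias: TypeAlias = UnknownClass[T] | None`.
MyAlias = Optional[UnknownClass[T]]

# The alias is generic over exactly one legacy type variable...
print(MyAlias.__parameters__)

# ...so a single type argument is the right number.
specialized = MyAlias[int]
print(specialized)
```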

## Test Plan

New Markdown tests.
2025-12-03 10:00:02 +01:00
Denys Zhak
f4e4229683 Add token based parenthesized_ranges implementation (#21738)
Co-authored-by: Micha Reiser <micha@reiser.io>
2025-12-03 08:15:17 +00:00
David Peter
e6ddeed386 [ty] Default-specialization of generic type aliases (#21765)
## Summary

Implement default-specialization of generic type aliases (implicit or
PEP-613) if they are used in a type expression without an explicit
specialization.

closes https://github.com/astral-sh/ty/issues/1690

## Typing conformance

```diff
-generics_defaults_specialization.py:26:5: error[type-assertion-failure] Type `SomethingWithNoDefaults[int, str]` does not match asserted type `SomethingWithNoDefaults[int, DefaultStrT]`
```

That's exactly what we want ✔️ 

All other tests in this file pass as well, with the exception of this
assertion, which is just wrong (at least according to our
interpretation, `type[Bar] != <class 'Bar'>`). I checked that we do
correctly default-specialize the type parameter which is not displayed
in the diagnostic that we raise.
```py
class Bar(SubclassMe[int, DefaultStrT]): ...

assert_type(Bar, type[Bar[str]])  # ty: Type `type[Bar[str]]` does not match asserted type `<class 'Bar'>`
```

## Ecosystem impact

Looks like I should have included this last week 😎 

## Test Plan

Updated pre-existing tests and added a few new ones.
2025-12-03 09:10:45 +01:00
Alex Waygood
c5b8d551df [ty] Suppress false positives when dataclasses.dataclass(...)(cls) is called imperatively (#21729)
Fixes https://github.com/astral-sh/ty/issues/1705
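
For readers unfamiliar with the pattern named in the title, this is what
calling `dataclasses.dataclass(...)(cls)` imperatively looks like at runtime.
The `Point` class is a made-up example, not taken from the linked issue.

```python
import dataclasses


class Point:
    x: int
    y: int = 0


# Equivalent to decorating the class with @dataclasses.dataclass(frozen=True),
# but applied as an ordinary call after the class body has been defined.
Point = dataclasses.dataclass(frozen=True)(Point)

p = Point(1)
print(p)
```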
2025-12-03 08:05:25 +00:00
Bhuminjay Soni
f68080b55e [syntax-error] Default type parameter followed by non-default type parameter (#21657)
## Summary

This PR implements a syntax error for the case where a default type
parameter is followed by a non-default type parameter.
https://github.com/astral-sh/ruff/issues/17412#issuecomment-3584088217
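
A minimal sketch of the construct in question, checked against the running
interpreter with the built-in `compile`. The class name `C` is arbitrary.
Note the assumption about versions: on Python 3.13 and later the snippet is
rejected because of the parameter ordering, while older interpreters reject
it simply because type parameter defaults (or the type parameter syntax
itself) are not available.

```python
# A defaulted type parameter (T) followed by a non-defaulted one (U).
source = "class C[T = int, U]: ...\n"

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError as exc:
    rejected = True
    print(f"rejected: {exc.msg}")
```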


## Test Plan

I have written inline tests as directed in #17412

---------

Signed-off-by: 11happy <bhuminjaysoni@gmail.com>
Signed-off-by: 11happy <soni5happy@gmail.com>
2025-12-03 12:01:31 +05:30
Douglas Creager
d3fd988337 fix tests 2025-12-02 21:49:03 -05:00
Douglas Creager
a0f64bd0ae even more hack 2025-12-02 21:41:55 -05:00
Douglas Creager
beb2956a14 carry over failing test from conformance suite 2025-12-02 21:32:02 -05:00
Douglas Creager
58c67fd4cd don't create T ≤ T constraints 2025-12-02 19:01:08 -05:00
Douglas Creager
a303b7a8aa Merge remote-tracking branch 'origin/main' into dcreager/callable-return
* origin/main:
  new module for parsing ranged suppressions (#21441)
  [ty] `type[T]` is assignable to an inferable typevar (#21766)
  Fix syntax error false positives for `await` outside functions (#21763)
  [ty] Improve diagnostics for unsupported comparison operations (#21737)
2025-12-02 18:42:43 -05:00
Amethyst Reese
abaa49f552 new module for parsing ranged suppressions (#21441)
This adds a new `suppression` module to the `ruff_linter` crate, similar
to the suppression module for ty, to parse comments for ruff suppression
directives, such as `# ruff: disable[CODE]`.
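
To give a feel for the kind of comment being parsed, here is a toy Python
sketch that recognizes only the single form quoted above. It is not the new
Rust module: the regular expression, the function name
`parse_disable_directive`, and the assumption that a directive carries
exactly one alphanumeric code are all hypothetical.

```python
import re
from typing import Optional

# Assumed shape, based only on the `# ruff: disable[CODE]` example.
DIRECTIVE = re.compile(r"#\s*ruff:\s*disable\[(?P<code>[A-Za-z0-9]+)\]")


def parse_disable_directive(comment: str) -> Optional[str]:
    """Return the rule code from a disable directive, or None if absent."""
    match = DIRECTIVE.search(comment)
    return match.group("code") if match else None


print(parse_disable_directive("# ruff: disable[E501]"))
print(parse_disable_directive("# an ordinary comment"))
```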
2025-12-02 15:39:59 -08:00
Douglas Creager
30452586ad clippity bippity 2025-12-02 18:27:16 -05:00
Ibraheem Ahmed
7b0aab1696 [ty] type[T] is assignable to an inferable typevar (#21766)
## Summary

Resolves https://github.com/astral-sh/ty/issues/1712.
2025-12-02 18:25:09 -05:00
Douglas Creager
7bbf839325 hackity hack 2025-12-02 18:24:15 -05:00
Brent Westbrook
2250fa6f98 Fix syntax error false positives for await outside functions (#21763)
## Summary

Fixes #21750 and a related bug in `PLE1142`. We were not properly
considering generators to be valid `await` contexts, which caused the
`F704` issue. One of the tests I added for this also uncovered an issue
in `PLE1142` for comprehensions nested within async generators because
we were only checking the current scope rather than traversing the
nested context.
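
The distinction can be checked against CPython itself with the built-in
`compile`, which parses and compiles without running anything (so the
undefined name `f` is harmless). This is only a sketch: the helper
`is_valid` is a made-up name, and the snippets are a subset of the cases in
the new test fixture.

```python
def is_valid(source: str) -> bool:
    """Report whether CPython accepts `source` as a module."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


cases = [
    # `await` inside a generator expression, sync or async.
    "(await cor for cor in f())",
    "(await cor async for cor in f())",
    # `await` inside a list comprehension at module level.
    "[await cor for cor in f()]",
    # `await` in the first iterator, which is evaluated in the parent scope.
    "(cor async for cor in await f())",
]

for case in cases:
    print(f"{case!r}: {'ok' if is_valid(case) else 'syntax error'}")
```

The first two snippets should compile, and the last two should be rejected,
matching the `# ok` and `# F704` annotations in the fixture.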

## Test Plan

Both of these rules are implemented as semantic syntax errors, so I
added tests (and fixes) in both Ruff and ty.
2025-12-02 21:02:02 +00:00
Douglas Creager
957304ec15 mdlint 2025-12-02 15:40:43 -05:00
Alex Waygood
392a8e4e50 [ty] Improve diagnostics for unsupported comparison operations (#21737) 2025-12-02 19:58:45 +00:00
Douglas Creager
d88120b187 mark these as TODO 2025-12-02 14:46:29 -05:00
Douglas Creager
2b949b3e67 Merge remote-tracking branch 'origin/main' into dcreager/callable-return
* origin/main: (67 commits)
  Move `Token`, `TokenKind` and `Tokens` to `ruff-python-ast` (#21760)
  [ty] Don't confuse multiple occurrences of `typing.Self` when binding bound methods (#21754)
  Use our org-wide Renovate preset (#21759)
  Delete `my-script.py` (#21751)
  [ty] Move `all_members`, and related types/routines, out of `ide_support.rs` (#21695)
  [ty] Fix find-references for import aliases (#21736)
  [ty] add tests for workspaces (#21741)
  [ty] Stop testing the (brittle) constraint set display implementation (#21743)
  [ty] Use generator over list comprehension to avoid cast (#21748)
  [ty] Add a diagnostic for prohibited `NamedTuple` attribute overrides (#21717)
  [ty] Fix subtyping with `type[T]` and unions (#21740)
  Use `npm ci --ignore-scripts` everywhere (#21742)
  [`flake8-simplify`] Fix truthiness assumption for non-iterable arguments in tuple/list/set calls (`SIM222`, `SIM223`) (#21479)
  [`flake8-use-pathlib`] Mark fixes unsafe for return type changes (`PTH104`, `PTH105`, `PTH109`, `PTH115`) (#21440)
  [ty] Fix auto-import code action to handle pre-existing import
  Enable PEP 740 attestations when publishing to PyPI (#21735)
  [ty] Fix find references for type defined in stub (#21732)
  Use OIDC instead of codspeed token (#21719)
  [ty] Exclude `typing_extensions` from completions unless it's really available
  [ty] Fix false positives for `class F(Generic[*Ts]): ...` (#21723)
  ...
2025-12-02 14:23:15 -05:00
Micha Reiser
515de2d062 Move Token, TokenKind and Tokens to ruff-python-ast (#21760) 2025-12-02 20:10:46 +01:00
Douglas Creager
2c6267436f clean up the diff 2025-11-26 18:35:15 -05:00
Douglas Creager
fedc75463b this gets recursively expanded now 2025-11-26 18:35:15 -05:00
Douglas Creager
9950c126fe these need to be positional only to be assignable 2025-11-26 18:35:15 -05:00
Douglas Creager
b7fb6797b4 it works! 2025-11-26 18:35:15 -05:00
Douglas Creager
fc2f17508b use constraint set assignable 2025-11-26 18:35:15 -05:00
Douglas Creager
20ecb561bb add ConstraintSetAssignability relation 2025-11-26 18:35:15 -05:00
Douglas Creager
3b509e9015 it's a start 2025-11-26 18:35:15 -05:00
Douglas Creager
998b20f078 add for_each_path 2025-11-26 18:35:15 -05:00
Douglas Creager
544dafa66e add more sequents 2025-11-26 18:35:15 -05:00
168 changed files with 7256 additions and 2357 deletions

1
Cargo.lock generated
View File

@@ -3124,6 +3124,7 @@ dependencies = [
"bitflags 2.10.0",
"clap",
"colored 3.0.0",
"compact_str",
"fern",
"glob",
"globset",

View File

@@ -6,7 +6,8 @@ use criterion::{
use ruff_benchmark::{
LARGE_DATASET, NUMPY_CTYPESLIB, NUMPY_GLOBALS, PYDANTIC_TYPES, TestCase, UNICODE_PYPINYIN,
};
use ruff_python_parser::{Mode, TokenKind, lexer};
use ruff_python_ast::token::TokenKind;
use ruff_python_parser::{Mode, lexer};
#[cfg(target_os = "windows")]
#[global_allocator]

View File

@@ -21,7 +21,11 @@ use crate::source::source_text;
 /// reflected in the changed AST offsets.
 /// The other reason is that Ruff's AST doesn't implement `Eq` which Salsa requires
 /// for determining if a query result is unchanged.
-#[salsa::tracked(returns(ref), no_eq, heap_size=ruff_memory_usage::heap_size)]
+///
+/// The LRU capacity of 200 was picked without any empirical evidence that it's optimal,
+/// instead it's a wild guess that it should be unlikely that incremental changes involve
+/// more than 200 modules. Parsed ASTs within the same revision are never evicted by Salsa.
+#[salsa::tracked(returns(ref), no_eq, heap_size=ruff_memory_usage::heap_size, lru=200)]
 pub fn parsed_module(db: &dyn Db, file: File) -> ParsedModule {
     let _span = tracing::trace_span!("parsed_module", ?file).entered();
@@ -92,14 +96,9 @@ impl ParsedModule {
         self.inner.store(None);
     }
-    /// Returns the pointer address of this [`ParsedModule`].
-    ///
-    /// The pointer uniquely identifies the module within the current Salsa revision,
-    /// regardless of whether particular [`ParsedModuleRef`] instances are garbage collected.
-    pub fn addr(&self) -> usize {
-        // Note that the outer `Arc` in `inner` is stable across garbage collection, while the inner
-        // `Arc` within the `ArcSwap` may change.
-        Arc::as_ptr(&self.inner).addr()
+    /// Returns the file to which this module belongs.
+    pub fn file(&self) -> File {
+        self.file
     }
 }

View File

@@ -35,6 +35,7 @@ anyhow = { workspace = true }
bitflags = { workspace = true }
clap = { workspace = true, features = ["derive", "string"], optional = true }
colored = { workspace = true }
compact_str = { workspace = true }
fern = { workspace = true }
glob = { workspace = true }
globset = { workspace = true }

View File

@@ -17,3 +17,24 @@ def _():
# Valid yield scope
yield 3
# await is valid in any generator, sync or async
(await cor async for cor in f()) # ok
(await cor for cor in f()) # ok
# but not in comprehensions
[await cor async for cor in f()] # F704
{await cor async for cor in f()} # F704
{await cor: 1 async for cor in f()} # F704
[await cor for cor in f()] # F704
{await cor for cor in f()} # F704
{await cor: 1 for cor in f()} # F704
# or in the iterator of an async generator, which is evaluated in the parent
# scope
(cor async for cor in await f()) # F704
(await cor async for cor in [await c for c in f()]) # F704
# this is also okay because the comprehension is within the generator scope
([await c for c in cor] async for cor in f()) # ok

View File

@@ -3,3 +3,5 @@ def func():
# Top-level await
await 1
([await c for c in cor] async for cor in func()) # ok

View File

@@ -35,6 +35,7 @@ use ruff_python_ast::helpers::{collect_import_from_member, is_docstring_stmt, to
 use ruff_python_ast::identifier::Identifier;
 use ruff_python_ast::name::QualifiedName;
 use ruff_python_ast::str::Quote;
+use ruff_python_ast::token::Tokens;
 use ruff_python_ast::visitor::{Visitor, walk_except_handler, walk_pattern};
 use ruff_python_ast::{
     self as ast, AnyParameterRef, ArgOrKeyword, Comprehension, ElifElseClause, ExceptHandler, Expr,
@@ -48,7 +49,7 @@ use ruff_python_parser::semantic_errors::{
     SemanticSyntaxChecker, SemanticSyntaxContext, SemanticSyntaxError, SemanticSyntaxErrorKind,
 };
 use ruff_python_parser::typing::{AnnotationKind, ParsedAnnotation, parse_type_annotation};
-use ruff_python_parser::{ParseError, Parsed, Tokens};
+use ruff_python_parser::{ParseError, Parsed};
 use ruff_python_semantic::all::{DunderAllDefinition, DunderAllFlags};
 use ruff_python_semantic::analyze::{imports, typing};
 use ruff_python_semantic::{
@@ -746,6 +747,7 @@ impl SemanticSyntaxContext for Checker<'_> {
             | SemanticSyntaxErrorKind::LoadBeforeNonlocalDeclaration { .. }
             | SemanticSyntaxErrorKind::NonlocalAndGlobal(_)
             | SemanticSyntaxErrorKind::AnnotatedGlobal(_)
+            | SemanticSyntaxErrorKind::TypeParameterDefaultOrder(_)
             | SemanticSyntaxErrorKind::AnnotatedNonlocal(_) => {
                 self.semantic_errors.borrow_mut().push(error);
             }
@@ -779,6 +781,10 @@ impl SemanticSyntaxContext for Checker<'_> {
             match scope.kind {
                 ScopeKind::Class(_) => return false,
                 ScopeKind::Function(_) | ScopeKind::Lambda(_) => return true,
+                ScopeKind::Generator {
+                    kind: GeneratorKind::Generator,
+                    ..
+                } => return true,
                 ScopeKind::Generator { .. }
                 | ScopeKind::Module
                 | ScopeKind::Type
@@ -828,14 +834,19 @@ impl SemanticSyntaxContext for Checker<'_> {
         self.source_type.is_ipynb()
     }
-    fn in_generator_scope(&self) -> bool {
-        matches!(
-            &self.semantic.current_scope().kind,
-            ScopeKind::Generator {
-                kind: GeneratorKind::Generator,
-                ..
+    fn in_generator_context(&self) -> bool {
+        for scope in self.semantic.current_scopes() {
+            if matches!(
+                scope.kind,
+                ScopeKind::Generator {
+                    kind: GeneratorKind::Generator,
+                    ..
+                }
+            ) {
+                return true;
             }
-        )
+        }
+        false
     }
     fn in_loop_context(&self) -> bool {

View File

@@ -1,6 +1,6 @@
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_codegen::Stylist;
use ruff_python_index::Indexer;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextRange};

View File

@@ -4,9 +4,9 @@ use std::path::Path;
use ruff_notebook::CellOffsets;
use ruff_python_ast::PySourceType;
use ruff_python_ast::token::Tokens;
use ruff_python_codegen::Stylist;
use ruff_python_index::Indexer;
use ruff_python_parser::Tokens;
use crate::Locator;
use crate::directives::TodoComment;

View File

@@ -5,8 +5,8 @@ use std::str::FromStr;
use bitflags::bitflags;
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_index::Indexer;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_trivia::CommentRanges;
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};

View File

@@ -5,8 +5,8 @@ use std::iter::FusedIterator;
use std::slice::Iter;
use ruff_python_ast::statement_visitor::{StatementVisitor, walk_stmt};
use ruff_python_ast::token::{Token, TokenKind, Tokens};
use ruff_python_ast::{self as ast, Stmt, Suite};
use ruff_python_parser::{Token, TokenKind, Tokens};
use ruff_source_file::UniversalNewlineIterator;
use ruff_text_size::{Ranged, TextSize};

View File

@@ -9,10 +9,11 @@ use anyhow::Result;
use libcst_native as cst;
use ruff_diagnostics::Edit;
use ruff_python_ast::token::Tokens;
use ruff_python_ast::{self as ast, Expr, ModModule, Stmt};
use ruff_python_codegen::Stylist;
use ruff_python_importer::Insertion;
use ruff_python_parser::{Parsed, Tokens};
use ruff_python_parser::Parsed;
use ruff_python_semantic::{
ImportedName, MemberNameImport, ModuleNameImport, NameImport, SemanticModel,
};

View File

@@ -46,6 +46,7 @@ pub mod rule_selector;
pub mod rules;
pub mod settings;
pub mod source_kind;
pub mod suppression;
mod text_helpers;
pub mod upstream_categories;
mod violation;

View File

@@ -1,6 +1,6 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_index::Indexer;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_text_size::{Ranged, TextRange};
use crate::Locator;

View File

@@ -3,7 +3,7 @@ use ruff_python_ast as ast;
use ruff_python_ast::ExprGenerator;
use ruff_python_ast::comparable::ComparableExpr;
use ruff_python_ast::parenthesize::parenthesized_range;
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::checkers::ast::Checker;

View File

@@ -3,7 +3,7 @@ use ruff_python_ast as ast;
use ruff_python_ast::ExprGenerator;
use ruff_python_ast::comparable::ComparableExpr;
use ruff_python_ast::parenthesize::parenthesized_range;
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::checkers::ast::Checker;

View File

@@ -1,7 +1,7 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast as ast;
use ruff_python_ast::parenthesize::parenthesized_range;
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::checkers::ast::Checker;

View File

@@ -3,8 +3,8 @@ use std::borrow::Cow;
use itertools::Itertools;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::StringFlags;
use ruff_python_ast::token::{Token, TokenKind, Tokens};
use ruff_python_index::Indexer;
use ruff_python_parser::{Token, TokenKind, Tokens};
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextLen, TextRange};

View File

@@ -1,6 +1,6 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_ast::{self as ast, Expr};
use ruff_python_parser::{TokenKind, Tokens};
use ruff_text_size::{Ranged, TextLen, TextSize};
use crate::checkers::ast::Checker;

View File

@@ -4,10 +4,10 @@ use ruff_diagnostics::Applicability;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::helpers::{is_const_false, is_const_true};
use ruff_python_ast::stmt_if::elif_else_range;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::whitespace::indentation;
use ruff_python_ast::{self as ast, Decorator, ElifElseClause, Expr, Stmt};
use ruff_python_parser::TokenKind;
use ruff_python_semantic::SemanticModel;
use ruff_python_semantic::analyze::visibility::is_property;
use ruff_python_trivia::{SimpleTokenKind, SimpleTokenizer, is_python_whitespace};

View File

@@ -1,5 +1,5 @@
use ruff_python_ast::token::Tokens;
use ruff_python_ast::{self as ast, Stmt};
use ruff_python_parser::Tokens;
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextRange};

View File

@@ -1,5 +1,5 @@
use ruff_python_ast::Stmt;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_trivia::PythonWhitespace;
use ruff_source_file::UniversalNewlines;
use ruff_text_size::Ranged;

View File

@@ -11,8 +11,8 @@ use comments::Comment;
use normalize::normalize_imports;
use order::order_imports;
use ruff_python_ast::PySourceType;
use ruff_python_ast::token::Tokens;
use ruff_python_codegen::Stylist;
use ruff_python_parser::Tokens;
use settings::Settings;
use types::EitherImport::{Import, ImportFrom};
use types::{AliasData, ImportBlock, TrailingComma};

View File

@@ -1,11 +1,11 @@
use itertools::{EitherOrBoth, Itertools};
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::Tokens;
use ruff_python_ast::whitespace::trailing_lines_end;
use ruff_python_ast::{PySourceType, PythonVersion, Stmt};
use ruff_python_codegen::Stylist;
use ruff_python_index::Indexer;
use ruff_python_parser::Tokens;
use ruff_python_trivia::{PythonWhitespace, leading_indentation, textwrap::indent};
use ruff_source_file::{LineRanges, UniversalNewlines};
use ruff_text_size::{Ranged, TextRange};

View File

@@ -1,4 +1,4 @@
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
/// Returns `true` if the name should be considered "ambiguous".
pub(super) fn is_ambiguous_name(name: &str) -> bool {

View File

@@ -8,10 +8,10 @@ use itertools::Itertools;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_notebook::CellOffsets;
use ruff_python_ast::PySourceType;
use ruff_python_ast::token::TokenIterWithContext;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::token::Tokens;
use ruff_python_codegen::Stylist;
use ruff_python_parser::TokenIterWithContext;
use ruff_python_parser::TokenKind;
use ruff_python_parser::Tokens;
use ruff_python_trivia::PythonWhitespace;
use ruff_source_file::{LineRanges, UniversalNewlines};
use ruff_text_size::TextRange;

View File

@@ -1,8 +1,8 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_notebook::CellOffsets;
use ruff_python_ast::PySourceType;
use ruff_python_ast::token::{TokenIterWithContext, TokenKind, Tokens};
use ruff_python_index::Indexer;
use ruff_python_parser::{TokenIterWithContext, TokenKind, Tokens};
use ruff_text_size::{Ranged, TextSize};
use crate::Locator;

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange};
use crate::AlwaysFixableViolation;

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::TextRange;
use crate::Violation;

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::Ranged;
use crate::Edit;

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::Ranged;
use crate::checkers::ast::LintContext;

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange};
use crate::checkers::ast::LintContext;

View File

@@ -9,7 +9,7 @@ pub(crate) use missing_whitespace::*;
pub(crate) use missing_whitespace_after_keyword::*;
pub(crate) use missing_whitespace_around_operator::*;
pub(crate) use redundant_backslash::*;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_trivia::is_python_whitespace;
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};
pub(crate) use space_around_operator::*;

View File

@@ -1,6 +1,6 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::TokenKind;
use ruff_python_index::Indexer;
use ruff_python_parser::TokenKind;
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextRange, TextSize};

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange};
use crate::checkers::ast::LintContext;

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::checkers::ast::LintContext;

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_python_trivia::PythonWhitespace;
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::checkers::ast::LintContext;

View File

@@ -3,7 +3,7 @@ use std::iter::Peekable;
use itertools::Itertools;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_notebook::CellOffsets;
use ruff_python_parser::{Token, TokenKind, Tokens};
use ruff_python_ast::token::{Token, TokenKind, Tokens};
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::{AlwaysFixableViolation, Edit, Fix, checkers::ast::LintContext};

View File

@@ -2,8 +2,8 @@ use anyhow::{Error, bail};
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::helpers;
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_ast::{CmpOp, Expr};
use ruff_python_parser::{TokenKind, Tokens};
use ruff_text_size::{Ranged, TextRange};
use crate::checkers::ast::Checker;

View File

@@ -3,8 +3,8 @@ use itertools::Itertools;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::helpers::contains_effect;
use ruff_python_ast::parenthesize::parenthesized_range;
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_ast::{self as ast, Stmt};
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_semantic::Binding;
use ruff_text_size::{Ranged, TextRange, TextSize};

View File

@@ -37,3 +37,88 @@ F704 `await` statement outside of a function
12 |
13 | def _():
|
F704 `await` statement outside of a function
--> F704.py:27:2
|
26 | # but not in comprehensions
27 | [await cor async for cor in f()] # F704
| ^^^^^^^^^
28 | {await cor async for cor in f()} # F704
29 | {await cor: 1 async for cor in f()} # F704
|
F704 `await` statement outside of a function
--> F704.py:28:2
|
26 | # but not in comprehensions
27 | [await cor async for cor in f()] # F704
28 | {await cor async for cor in f()} # F704
| ^^^^^^^^^
29 | {await cor: 1 async for cor in f()} # F704
30 | [await cor for cor in f()] # F704
|
F704 `await` statement outside of a function
--> F704.py:29:2
|
27 | [await cor async for cor in f()] # F704
28 | {await cor async for cor in f()} # F704
29 | {await cor: 1 async for cor in f()} # F704
| ^^^^^^^^^
30 | [await cor for cor in f()] # F704
31 | {await cor for cor in f()} # F704
|
F704 `await` statement outside of a function
--> F704.py:30:2
|
28 | {await cor async for cor in f()} # F704
29 | {await cor: 1 async for cor in f()} # F704
30 | [await cor for cor in f()] # F704
| ^^^^^^^^^
31 | {await cor for cor in f()} # F704
32 | {await cor: 1 for cor in f()} # F704
|
F704 `await` statement outside of a function
--> F704.py:31:2
|
29 | {await cor: 1 async for cor in f()} # F704
30 | [await cor for cor in f()] # F704
31 | {await cor for cor in f()} # F704
| ^^^^^^^^^
32 | {await cor: 1 for cor in f()} # F704
|
F704 `await` statement outside of a function
--> F704.py:32:2
|
30 | [await cor for cor in f()] # F704
31 | {await cor for cor in f()} # F704
32 | {await cor: 1 for cor in f()} # F704
| ^^^^^^^^^
33 |
34 | # or in the iterator of an async generator, which is evaluated in the parent
|
F704 `await` statement outside of a function
--> F704.py:36:23
|
34 | # or in the iterator of an async generator, which is evaluated in the parent
35 | # scope
36 | (cor async for cor in await f()) # F704
| ^^^^^^^^^
37 | (await cor async for cor in [await c for c in f()]) # F704
|
F704 `await` statement outside of a function
--> F704.py:37:30
|
35 | # scope
36 | (cor async for cor in await f()) # F704
37 | (await cor async for cor in [await c for c in f()]) # F704
| ^^^^^^^
38 |
39 | # this is also okay because the comprehension is within the generator scope
|

View File

@@ -1,5 +1,5 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::{Token, TokenKind};
use ruff_python_ast::token::{Token, TokenKind};
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};
use crate::Locator;

View File

@@ -1,5 +1,5 @@
use ruff_python_ast::StmtImportFrom;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_text_size::{Ranged, TextRange};
use crate::Locator;

View File

@@ -1,10 +1,10 @@
use itertools::Itertools;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::Tokens;
use ruff_python_ast::whitespace::indentation;
use ruff_python_ast::{Alias, StmtImportFrom, StmtRef};
use ruff_python_codegen::Stylist;
use ruff_python_parser::Tokens;
use ruff_text_size::Ranged;
use crate::Locator;

View File

@@ -1,7 +1,7 @@
use std::slice::Iter;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_parser::{Token, TokenKind, Tokens};
use ruff_python_ast::token::{Token, TokenKind, Tokens};
use ruff_text_size::{Ranged, TextRange};
use crate::Locator;

View File

@@ -6,11 +6,11 @@ use rustc_hash::{FxHashMap, FxHashSet};
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::helpers::any_over_expr;
use ruff_python_ast::str::{leading_quote, trailing_quote};
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{self as ast, Expr, Keyword, StringFlags};
use ruff_python_literal::format::{
FieldName, FieldNamePart, FieldType, FormatPart, FormatString, FromTemplate,
};
use ruff_python_parser::TokenKind;
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextRange};

View File

@@ -3,12 +3,12 @@ use std::fmt::Write;
use std::str::FromStr;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{self as ast, AnyStringFlags, Expr, StringFlags, whitespace::indentation};
use ruff_python_codegen::Stylist;
use ruff_python_literal::cformat::{
CConversionFlags, CFormatPart, CFormatPrecision, CFormatQuantity, CFormatString,
};
use ruff_python_parser::TokenKind;
use ruff_python_stdlib::identifiers::is_identifier;
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextRange};

View File

@@ -1,6 +1,6 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::Stmt;
use ruff_python_parser::TokenKind;
use ruff_python_ast::token::TokenKind;
use ruff_python_semantic::SemanticModel;
use ruff_source_file::LineRanges;
use ruff_text_size::{TextLen, TextRange, TextSize};

View File

@@ -1,7 +1,7 @@
use anyhow::Result;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_ast::{self as ast, Expr};
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_stdlib::open_mode::OpenMode;
use ruff_text_size::{Ranged, TextSize};

View File

@@ -1,8 +1,8 @@
use std::fmt::Write as _;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_ast::{self as ast, Arguments, Expr, Keyword};
use ruff_python_parser::{TokenKind, Tokens};
use ruff_text_size::{Ranged, TextRange};
use crate::Locator;

View File

@@ -4,8 +4,8 @@ use anyhow::Result;
use libcst_native::{LeftParen, ParenthesizedNode, RightParen};
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{self as ast, Expr, OperatorPrecedence};
use ruff_python_parser::TokenKind;
use ruff_text_size::Ranged;
use crate::checkers::ast::Checker;

View File

@@ -2,9 +2,9 @@ use std::cmp::Ordering;
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::helpers::comment_indentation_after;
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_ast::whitespace::indentation;
use ruff_python_ast::{Stmt, StmtExpr, StmtFor, StmtIf, StmtTry, StmtWhile};
use ruff_python_parser::{TokenKind, Tokens};
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};

View File

@@ -9,8 +9,8 @@ use std::cmp::Ordering;
use itertools::Itertools;
use ruff_python_ast as ast;
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_codegen::Stylist;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_stdlib::str::is_cased_uppercase;
use ruff_python_trivia::{SimpleTokenKind, first_non_trivia_token, leading_indentation};
use ruff_source_file::LineRanges;

View File

@@ -1,7 +1,7 @@
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::PythonVersion;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{Expr, ExprCall, parenthesize::parenthesized_range};
use ruff_python_parser::TokenKind;
use ruff_text_size::{Ranged, TextRange};
use crate::checkers::ast::Checker;

View File

@@ -17,4 +17,6 @@ PLE1142 `await` should be used within an async function
4 | # Top-level await
5 | await 1
| ^^^^^^^
6 |
7 | ([await c for c in cor] async for cor in func()) # ok
|

File diff suppressed because it is too large

View File

@@ -1,17 +1,46 @@
use std::sync::{LazyLock, Mutex};
use std::cell::RefCell;
use get_size2::{GetSize, StandardTracker};
use ordermap::{OrderMap, OrderSet};
thread_local! {
pub static TRACKER: RefCell<Option<StandardTracker>>= const { RefCell::new(None) };
}
struct TrackerGuard(Option<StandardTracker>);
impl Drop for TrackerGuard {
fn drop(&mut self) {
TRACKER.set(self.0.take());
}
}
pub fn attach_tracker<R>(tracker: StandardTracker, f: impl FnOnce() -> R) -> R {
let prev = TRACKER.replace(Some(tracker));
let _guard = TrackerGuard(prev);
f()
}
fn with_tracker<F, R>(f: F) -> R
where
F: FnOnce(Option<&mut StandardTracker>) -> R,
{
TRACKER.with(|tracker| {
let mut tracker = tracker.borrow_mut();
f(tracker.as_mut())
})
}
/// Returns the memory usage of the provided object, using a global tracker to avoid
/// double-counting shared objects.
pub fn heap_size<T: GetSize>(value: &T) -> usize {
static TRACKER: LazyLock<Mutex<StandardTracker>> =
LazyLock::new(|| Mutex::new(StandardTracker::new()));
value
.get_heap_size_with_tracker(&mut *TRACKER.lock().unwrap())
.0
with_tracker(|tracker| {
if let Some(tracker) = tracker {
value.get_heap_size_with_tracker(tracker).0
} else {
value.get_heap_size()
}
})
}
/// An implementation of [`GetSize::get_heap_size`] for [`OrderSet`].
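
The hunk above replaces the process-wide `Mutex<StandardTracker>` with a thread-local tracker that is attached for the duration of a closure and restored by a drop guard. The following is a minimal sketch of that pattern using only the standard library. `SeenAddresses` and `tracked_size` are hypothetical stand-ins for `StandardTracker` and `heap_size`; they are assumptions made for illustration and are not part of `get_size2` or of this change.

```rust
use std::cell::RefCell;
use std::collections::HashSet;

// Hypothetical stand-in for `StandardTracker`: the set of addresses already counted.
type SeenAddresses = HashSet<usize>;

thread_local! {
    // No tracker is attached by default.
    static TRACKER: RefCell<Option<SeenAddresses>> = const { RefCell::new(None) };
}

/// Restores the previously attached tracker when dropped, including on unwind.
struct TrackerGuard(Option<SeenAddresses>);

impl Drop for TrackerGuard {
    fn drop(&mut self) {
        TRACKER.set(self.0.take());
    }
}

/// Runs `f` with `tracker` attached to the current thread.
pub fn attach_tracker<R>(tracker: SeenAddresses, f: impl FnOnce() -> R) -> R {
    let prev = TRACKER.replace(Some(tracker));
    let _guard = TrackerGuard(prev);
    f()
}

/// Counts `size` bytes for `address`, unless an attached tracker has already seen it.
/// Without an attached tracker, every call is counted (no deduplication).
pub fn tracked_size(address: usize, size: usize) -> usize {
    TRACKER.with_borrow_mut(|tracker| match tracker {
        Some(seen) => {
            if seen.insert(address) {
                size
            } else {
                0
            }
        }
        None => size,
    })
}
```

In this sketch, nested `attach_tracker` calls shadow the outer tracker and the guard puts it back afterwards, which mirrors the `prev` handling in the diff. Sharing is only deduplicated within one attached scope on one thread.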

View File

@@ -29,6 +29,7 @@ pub mod statement_visitor;
pub mod stmt_if;
pub mod str;
pub mod str_prefix;
pub mod token;
pub mod traversal;
pub mod types;
pub mod visitor;

View File

@@ -11,6 +11,8 @@ use crate::ExprRef;
/// Note that without a parent the range can be inaccurate, e.g. `f(a)` we falsely return a set of
/// parentheses around `a` even if the parentheses actually belong to `f`. That is why you should
/// generally prefer [`parenthesized_range`].
///
/// Prefer [`crate::token::parentheses_iterator`] if you have access to [`crate::token::Tokens`].
pub fn parentheses_iterator<'a>(
expr: ExprRef<'a>,
parent: Option<AnyNodeRef>,
@@ -57,6 +59,8 @@ pub fn parentheses_iterator<'a>(
/// Returns the [`TextRange`] of a given expression including parentheses, if the expression is
/// parenthesized; or `None`, if the expression is not parenthesized.
///
/// Prefer [`crate::token::parenthesized_range`] if you have access to [`crate::token::Tokens`].
pub fn parenthesized_range(
expr: ExprRef,
parent: AnyNodeRef,

View File

@@ -0,0 +1,853 @@
//! Token kinds for Python source code created by the lexer and consumed by the `ruff_python_parser`.
//!
//! This module defines the tokens that the lexer recognizes. The tokens are
//! loosely based on the token definitions found in the [CPython source].
//!
//! [CPython source]: https://github.com/python/cpython/blob/dfc2e065a2e71011017077e549cd2f9bf4944c54/Grammar/Tokens
use std::fmt;
use bitflags::bitflags;
use crate::str::{Quote, TripleQuotes};
use crate::str_prefix::{
AnyStringPrefix, ByteStringPrefix, FStringPrefix, StringLiteralPrefix, TStringPrefix,
};
use crate::{AnyStringFlags, BoolOp, Operator, StringFlags, UnaryOp};
use ruff_text_size::{Ranged, TextRange};
mod parentheses;
mod tokens;
pub use parentheses::{parentheses_iterator, parenthesized_range};
pub use tokens::{TokenAt, TokenIterWithContext, Tokens};
#[derive(Clone, Copy, PartialEq, Eq)]
#[cfg_attr(feature = "get-size", derive(get_size2::GetSize))]
pub struct Token {
/// The kind of the token.
kind: TokenKind,
/// The range of the token.
range: TextRange,
/// The set of flags describing this token.
flags: TokenFlags,
}
impl Token {
pub fn new(kind: TokenKind, range: TextRange, flags: TokenFlags) -> Token {
Self { kind, range, flags }
}
/// Returns the token kind.
#[inline]
pub const fn kind(&self) -> TokenKind {
self.kind
}
/// Returns the token as a tuple of (kind, range).
#[inline]
pub const fn as_tuple(&self) -> (TokenKind, TextRange) {
(self.kind, self.range)
}
/// Returns `true` if the current token is a triple-quoted string of any kind.
///
/// # Panics
///
/// If it isn't a string or any f/t-string tokens.
pub fn is_triple_quoted_string(self) -> bool {
self.unwrap_string_flags().is_triple_quoted()
}
/// Returns the [`Quote`] style for the current string token of any kind.
///
/// # Panics
///
/// If it isn't a string or any f/t-string tokens.
pub fn string_quote_style(self) -> Quote {
self.unwrap_string_flags().quote_style()
}
/// Returns the [`AnyStringFlags`] style for the current string token of any kind.
///
/// # Panics
///
/// If it isn't a string or any f/t-string tokens.
pub fn unwrap_string_flags(self) -> AnyStringFlags {
self.string_flags()
.unwrap_or_else(|| panic!("token to be a string"))
}
/// Returns the [`AnyStringFlags`] for the current token if it is a string of any kind,
/// or `None` otherwise.
pub fn string_flags(self) -> Option<AnyStringFlags> {
if self.is_any_string() {
Some(self.flags.as_any_string_flags())
} else {
None
}
}
/// Returns `true` if this is any kind of string token - including
/// tokens in t-strings (which do not have type `str`).
const fn is_any_string(self) -> bool {
matches!(
self.kind,
TokenKind::String
| TokenKind::FStringStart
| TokenKind::FStringMiddle
| TokenKind::FStringEnd
| TokenKind::TStringStart
| TokenKind::TStringMiddle
| TokenKind::TStringEnd
)
}
}
impl Ranged for Token {
fn range(&self) -> TextRange {
self.range
}
}
impl fmt::Debug for Token {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{:?} {:?}", self.kind, self.range)?;
if !self.flags.is_empty() {
f.write_str(" (flags = ")?;
let mut first = true;
for (name, _) in self.flags.iter_names() {
if first {
first = false;
} else {
f.write_str(" | ")?;
}
f.write_str(name)?;
}
f.write_str(")")?;
}
Ok(())
}
}
/// A kind of a token.
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug, PartialOrd, Ord)]
#[cfg_attr(feature = "get-size", derive(get_size2::GetSize))]
pub enum TokenKind {
/// Token kind for a name, commonly known as an identifier.
Name,
/// Token kind for an integer.
Int,
/// Token kind for a floating point number.
Float,
/// Token kind for a complex number.
Complex,
/// Token kind for a string.
String,
/// Token kind for the start of an f-string. This includes the `f`/`F`/`fr` prefix
/// and the opening quote(s).
FStringStart,
/// Token kind that includes the portion of text inside the f-string that's not
/// part of the expression part and isn't an opening or closing brace.
FStringMiddle,
/// Token kind for the end of an f-string. This includes the closing quote.
FStringEnd,
/// Token kind for the start of a t-string. This includes the `t`/`T`/`tr` prefix
/// and the opening quote(s).
TStringStart,
/// Token kind that includes the portion of text inside the t-string that's not
/// part of the interpolation part and isn't an opening or closing brace.
TStringMiddle,
/// Token kind for the end of a t-string. This includes the closing quote.
TStringEnd,
/// Token kind for an IPython escape command.
IpyEscapeCommand,
/// Token kind for a comment. These are filtered out of the token stream prior to parsing.
Comment,
/// Token kind for a newline.
Newline,
/// Token kind for a newline that is not a logical line break. These are filtered out of
/// the token stream prior to parsing.
NonLogicalNewline,
/// Token kind for an indent.
Indent,
/// Token kind for a dedent.
Dedent,
/// Token kind for the end of the file.
EndOfFile,
/// Token kind for a question mark `?`.
Question,
/// Token kind for an exclamation mark `!`.
Exclamation,
/// Token kind for a left parenthesis `(`.
Lpar,
/// Token kind for a right parenthesis `)`.
Rpar,
/// Token kind for a left square bracket `[`.
Lsqb,
/// Token kind for a right square bracket `]`.
Rsqb,
/// Token kind for a colon `:`.
Colon,
/// Token kind for a comma `,`.
Comma,
/// Token kind for a semicolon `;`.
Semi,
/// Token kind for plus `+`.
Plus,
/// Token kind for minus `-`.
Minus,
/// Token kind for star `*`.
Star,
/// Token kind for slash `/`.
Slash,
/// Token kind for vertical bar `|`.
Vbar,
/// Token kind for ampersand `&`.
Amper,
/// Token kind for less than `<`.
Less,
/// Token kind for greater than `>`.
Greater,
/// Token kind for equal `=`.
Equal,
/// Token kind for dot `.`.
Dot,
/// Token kind for percent `%`.
Percent,
/// Token kind for left brace `{`.
Lbrace,
/// Token kind for right brace `}`.
Rbrace,
/// Token kind for double equal `==`.
EqEqual,
/// Token kind for not equal `!=`.
NotEqual,
/// Token kind for less than or equal `<=`.
LessEqual,
/// Token kind for greater than or equal `>=`.
GreaterEqual,
/// Token kind for tilde `~`.
Tilde,
/// Token kind for caret `^`.
CircumFlex,
/// Token kind for left shift `<<`.
LeftShift,
/// Token kind for right shift `>>`.
RightShift,
/// Token kind for double star `**`.
DoubleStar,
/// Token kind for double star equal `**=`.
DoubleStarEqual,
/// Token kind for plus equal `+=`.
PlusEqual,
/// Token kind for minus equal `-=`.
MinusEqual,
/// Token kind for star equal `*=`.
StarEqual,
/// Token kind for slash equal `/=`.
SlashEqual,
/// Token kind for percent equal `%=`.
PercentEqual,
/// Token kind for ampersand equal `&=`.
AmperEqual,
/// Token kind for vertical bar equal `|=`.
VbarEqual,
/// Token kind for caret equal `^=`.
CircumflexEqual,
/// Token kind for left shift equal `<<=`.
LeftShiftEqual,
/// Token kind for right shift equal `>>=`.
RightShiftEqual,
/// Token kind for double slash `//`.
DoubleSlash,
/// Token kind for double slash equal `//=`.
DoubleSlashEqual,
/// Token kind for colon equal `:=`.
ColonEqual,
/// Token kind for at `@`.
At,
/// Token kind for at equal `@=`.
AtEqual,
/// Token kind for arrow `->`.
Rarrow,
/// Token kind for ellipsis `...`.
Ellipsis,
// The keywords should be sorted in alphabetical order. If the boundary tokens for the
// "Keywords" and "Soft keywords" group change, update the related methods on `TokenKind`.
// Keywords
And,
As,
Assert,
Async,
Await,
Break,
Class,
Continue,
Def,
Del,
Elif,
Else,
Except,
False,
Finally,
For,
From,
Global,
If,
Import,
In,
Is,
Lambda,
None,
Nonlocal,
Not,
Or,
Pass,
Raise,
Return,
True,
Try,
While,
With,
Yield,
// Soft keywords
Case,
Match,
Type,
Unknown,
}
impl TokenKind {
/// Returns `true` if this is an end of file token.
#[inline]
pub const fn is_eof(self) -> bool {
matches!(self, TokenKind::EndOfFile)
}
/// Returns `true` if this is either a newline or non-logical newline token.
#[inline]
pub const fn is_any_newline(self) -> bool {
matches!(self, TokenKind::Newline | TokenKind::NonLogicalNewline)
}
/// Returns `true` if the token is a keyword (including soft keywords).
///
/// See also [`is_soft_keyword`], [`is_non_soft_keyword`].
///
/// [`is_soft_keyword`]: TokenKind::is_soft_keyword
/// [`is_non_soft_keyword`]: TokenKind::is_non_soft_keyword
#[inline]
pub fn is_keyword(self) -> bool {
TokenKind::And <= self && self <= TokenKind::Type
}
/// Returns `true` if the token is strictly a soft keyword.
///
/// See also [`is_keyword`], [`is_non_soft_keyword`].
///
/// [`is_keyword`]: TokenKind::is_keyword
/// [`is_non_soft_keyword`]: TokenKind::is_non_soft_keyword
#[inline]
pub fn is_soft_keyword(self) -> bool {
TokenKind::Case <= self && self <= TokenKind::Type
}
/// Returns `true` if the token is strictly a non-soft keyword.
///
/// See also [`is_keyword`], [`is_soft_keyword`].
///
/// [`is_keyword`]: TokenKind::is_keyword
/// [`is_soft_keyword`]: TokenKind::is_soft_keyword
#[inline]
pub fn is_non_soft_keyword(self) -> bool {
TokenKind::And <= self && self <= TokenKind::Yield
}
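/// Returns `true` if this is an operator or delimiter token, including the keyword
/// operators `and`, `or`, `not`, `in`, and `is`.
#[inline]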
pub const fn is_operator(self) -> bool {
matches!(
self,
TokenKind::Lpar
| TokenKind::Rpar
| TokenKind::Lsqb
| TokenKind::Rsqb
| TokenKind::Comma
| TokenKind::Semi
| TokenKind::Plus
| TokenKind::Minus
| TokenKind::Star
| TokenKind::Slash
| TokenKind::Vbar
| TokenKind::Amper
| TokenKind::Less
| TokenKind::Greater
| TokenKind::Equal
| TokenKind::Dot
| TokenKind::Percent
| TokenKind::Lbrace
| TokenKind::Rbrace
| TokenKind::EqEqual
| TokenKind::NotEqual
| TokenKind::LessEqual
| TokenKind::GreaterEqual
| TokenKind::Tilde
| TokenKind::CircumFlex
| TokenKind::LeftShift
| TokenKind::RightShift
| TokenKind::DoubleStar
| TokenKind::PlusEqual
| TokenKind::MinusEqual
| TokenKind::StarEqual
| TokenKind::SlashEqual
| TokenKind::PercentEqual
| TokenKind::AmperEqual
| TokenKind::VbarEqual
| TokenKind::CircumflexEqual
| TokenKind::LeftShiftEqual
| TokenKind::RightShiftEqual
| TokenKind::DoubleStarEqual
| TokenKind::DoubleSlash
| TokenKind::DoubleSlashEqual
| TokenKind::At
| TokenKind::AtEqual
| TokenKind::Rarrow
| TokenKind::Ellipsis
| TokenKind::ColonEqual
| TokenKind::Colon
| TokenKind::And
| TokenKind::Or
| TokenKind::Not
| TokenKind::In
| TokenKind::Is
)
}
/// Returns `true` if this is a singleton token i.e., `True`, `False`, or `None`.
#[inline]
pub const fn is_singleton(self) -> bool {
matches!(self, TokenKind::False | TokenKind::True | TokenKind::None)
}
/// Returns `true` if this is a trivia token i.e., a comment or a non-logical newline.
#[inline]
pub const fn is_trivia(&self) -> bool {
matches!(self, TokenKind::Comment | TokenKind::NonLogicalNewline)
}
/// Returns `true` if this is a comment token.
#[inline]
pub const fn is_comment(&self) -> bool {
matches!(self, TokenKind::Comment)
}
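/// Returns `true` if this is an arithmetic operator token (`**`, `*`, `+`, `-`, `/`, `//`,
/// or `@`).
#[inline]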
pub const fn is_arithmetic(self) -> bool {
matches!(
self,
TokenKind::DoubleStar
| TokenKind::Star
| TokenKind::Plus
| TokenKind::Minus
| TokenKind::Slash
| TokenKind::DoubleSlash
| TokenKind::At
)
}
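/// Returns `true` if this is a bitwise or shift operator token, including the augmented
/// assignment forms.
#[inline]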
pub const fn is_bitwise_or_shift(self) -> bool {
matches!(
self,
TokenKind::LeftShift
| TokenKind::LeftShiftEqual
| TokenKind::RightShift
| TokenKind::RightShiftEqual
| TokenKind::Amper
| TokenKind::AmperEqual
| TokenKind::Vbar
| TokenKind::VbarEqual
| TokenKind::CircumFlex
| TokenKind::CircumflexEqual
| TokenKind::Tilde
)
}
/// Returns `true` if the current token is a unary arithmetic operator.
#[inline]
pub const fn is_unary_arithmetic_operator(self) -> bool {
matches!(self, TokenKind::Plus | TokenKind::Minus)
}
/// Returns `true` if this is the end token of an f-string or a t-string.
#[inline]
pub const fn is_interpolated_string_end(self) -> bool {
matches!(self, TokenKind::FStringEnd | TokenKind::TStringEnd)
}
/// Returns the [`UnaryOp`] that corresponds to this token kind, if it is a unary arithmetic
/// operator, otherwise return [None].
///
/// Use [`as_unary_operator`] to match against any unary operator.
///
/// [`as_unary_operator`]: TokenKind::as_unary_operator
#[inline]
pub const fn as_unary_arithmetic_operator(self) -> Option<UnaryOp> {
Some(match self {
TokenKind::Plus => UnaryOp::UAdd,
TokenKind::Minus => UnaryOp::USub,
_ => return None,
})
}
/// Returns the [`UnaryOp`] that corresponds to this token kind, if it is a unary operator,
/// otherwise return [None].
///
/// Use [`as_unary_arithmetic_operator`] to match against only an arithmetic unary operator.
///
/// [`as_unary_arithmetic_operator`]: TokenKind::as_unary_arithmetic_operator
#[inline]
pub const fn as_unary_operator(self) -> Option<UnaryOp> {
Some(match self {
TokenKind::Plus => UnaryOp::UAdd,
TokenKind::Minus => UnaryOp::USub,
TokenKind::Tilde => UnaryOp::Invert,
TokenKind::Not => UnaryOp::Not,
_ => return None,
})
}
/// Returns the [`BoolOp`] that corresponds to this token kind, if it is a boolean operator,
/// otherwise return [None].
#[inline]
pub const fn as_bool_operator(self) -> Option<BoolOp> {
Some(match self {
TokenKind::And => BoolOp::And,
TokenKind::Or => BoolOp::Or,
_ => return None,
})
}
/// Returns the binary [`Operator`] that corresponds to the current token, if it's a binary
/// operator, otherwise return [None].
///
/// Use [`as_augmented_assign_operator`] to match against an augmented assignment token.
///
/// [`as_augmented_assign_operator`]: TokenKind::as_augmented_assign_operator
pub const fn as_binary_operator(self) -> Option<Operator> {
Some(match self {
TokenKind::Plus => Operator::Add,
TokenKind::Minus => Operator::Sub,
TokenKind::Star => Operator::Mult,
TokenKind::At => Operator::MatMult,
TokenKind::DoubleStar => Operator::Pow,
TokenKind::Slash => Operator::Div,
TokenKind::DoubleSlash => Operator::FloorDiv,
TokenKind::Percent => Operator::Mod,
TokenKind::Amper => Operator::BitAnd,
TokenKind::Vbar => Operator::BitOr,
TokenKind::CircumFlex => Operator::BitXor,
TokenKind::LeftShift => Operator::LShift,
TokenKind::RightShift => Operator::RShift,
_ => return None,
})
}
/// Returns the [`Operator`] that corresponds to this token kind, if it is
/// an augmented assignment operator, or [`None`] otherwise.
#[inline]
pub const fn as_augmented_assign_operator(self) -> Option<Operator> {
Some(match self {
TokenKind::PlusEqual => Operator::Add,
TokenKind::MinusEqual => Operator::Sub,
TokenKind::StarEqual => Operator::Mult,
TokenKind::AtEqual => Operator::MatMult,
TokenKind::DoubleStarEqual => Operator::Pow,
TokenKind::SlashEqual => Operator::Div,
TokenKind::DoubleSlashEqual => Operator::FloorDiv,
TokenKind::PercentEqual => Operator::Mod,
TokenKind::AmperEqual => Operator::BitAnd,
TokenKind::VbarEqual => Operator::BitOr,
TokenKind::CircumflexEqual => Operator::BitXor,
TokenKind::LeftShiftEqual => Operator::LShift,
TokenKind::RightShiftEqual => Operator::RShift,
_ => return None,
})
}
}
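
The keyword predicates in the `impl TokenKind` block above (`is_keyword`, `is_soft_keyword`, `is_non_soft_keyword`) are range comparisons, so they depend on the variant order that `#[derive(PartialOrd, Ord)]` assigns from the declaration order. A reduced sketch of the same idea follows; the `Kind` enum is a hypothetical, shortened stand-in for `TokenKind` and keeps only the boundary variants.

```rust
// Hypothetical, shortened stand-in for `TokenKind`. The derived ordering follows
// declaration order, so the keyword groups must stay contiguous.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
pub enum Kind {
    Name,
    // Keywords (first and last of the group)
    And,
    Yield,
    // Soft keywords
    Case,
    Match,
    Type,
    Unknown,
}

impl Kind {
    /// Any keyword, hard or soft: everything from `And` through `Type`.
    pub fn is_keyword(self) -> bool {
        Kind::And <= self && self <= Kind::Type
    }

    /// Soft keywords only: `Case` through `Type`.
    pub fn is_soft_keyword(self) -> bool {
        Kind::Case <= self && self <= Kind::Type
    }

    /// Hard keywords only: `And` through `Yield`.
    pub fn is_non_soft_keyword(self) -> bool {
        Kind::And <= self && self <= Kind::Yield
    }
}
```

This is why the comment in the enum asks for the boundary methods to be updated whenever the first or last variant of a group changes: inserting a variant outside the intended range silently changes what the predicates return.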
impl From<BoolOp> for TokenKind {
#[inline]
fn from(op: BoolOp) -> Self {
match op {
BoolOp::And => TokenKind::And,
BoolOp::Or => TokenKind::Or,
}
}
}
impl From<UnaryOp> for TokenKind {
#[inline]
fn from(op: UnaryOp) -> Self {
match op {
UnaryOp::Invert => TokenKind::Tilde,
UnaryOp::Not => TokenKind::Not,
UnaryOp::UAdd => TokenKind::Plus,
UnaryOp::USub => TokenKind::Minus,
}
}
}
impl From<Operator> for TokenKind {
#[inline]
fn from(op: Operator) -> Self {
match op {
Operator::Add => TokenKind::Plus,
Operator::Sub => TokenKind::Minus,
Operator::Mult => TokenKind::Star,
Operator::MatMult => TokenKind::At,
Operator::Div => TokenKind::Slash,
Operator::Mod => TokenKind::Percent,
Operator::Pow => TokenKind::DoubleStar,
Operator::LShift => TokenKind::LeftShift,
Operator::RShift => TokenKind::RightShift,
Operator::BitOr => TokenKind::Vbar,
Operator::BitXor => TokenKind::CircumFlex,
Operator::BitAnd => TokenKind::Amper,
Operator::FloorDiv => TokenKind::DoubleSlash,
}
}
}
impl fmt::Display for TokenKind {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let value = match self {
TokenKind::Unknown => "Unknown",
TokenKind::Newline => "newline",
TokenKind::NonLogicalNewline => "NonLogicalNewline",
TokenKind::Indent => "indent",
TokenKind::Dedent => "dedent",
TokenKind::EndOfFile => "end of file",
TokenKind::Name => "name",
TokenKind::Int => "int",
TokenKind::Float => "float",
TokenKind::Complex => "complex",
TokenKind::String => "string",
TokenKind::FStringStart => "FStringStart",
TokenKind::FStringMiddle => "FStringMiddle",
TokenKind::FStringEnd => "FStringEnd",
TokenKind::TStringStart => "TStringStart",
TokenKind::TStringMiddle => "TStringMiddle",
TokenKind::TStringEnd => "TStringEnd",
TokenKind::IpyEscapeCommand => "IPython escape command",
TokenKind::Comment => "comment",
TokenKind::Question => "`?`",
TokenKind::Exclamation => "`!`",
TokenKind::Lpar => "`(`",
TokenKind::Rpar => "`)`",
TokenKind::Lsqb => "`[`",
TokenKind::Rsqb => "`]`",
TokenKind::Lbrace => "`{`",
TokenKind::Rbrace => "`}`",
TokenKind::Equal => "`=`",
TokenKind::ColonEqual => "`:=`",
TokenKind::Dot => "`.`",
TokenKind::Colon => "`:`",
TokenKind::Semi => "`;`",
TokenKind::Comma => "`,`",
TokenKind::Rarrow => "`->`",
TokenKind::Plus => "`+`",
TokenKind::Minus => "`-`",
TokenKind::Star => "`*`",
TokenKind::DoubleStar => "`**`",
TokenKind::Slash => "`/`",
TokenKind::DoubleSlash => "`//`",
TokenKind::Percent => "`%`",
TokenKind::Vbar => "`|`",
TokenKind::Amper => "`&`",
TokenKind::CircumFlex => "`^`",
TokenKind::LeftShift => "`<<`",
TokenKind::RightShift => "`>>`",
TokenKind::Tilde => "`~`",
TokenKind::At => "`@`",
TokenKind::Less => "`<`",
TokenKind::Greater => "`>`",
TokenKind::EqEqual => "`==`",
TokenKind::NotEqual => "`!=`",
TokenKind::LessEqual => "`<=`",
TokenKind::GreaterEqual => "`>=`",
TokenKind::PlusEqual => "`+=`",
TokenKind::MinusEqual => "`-=`",
TokenKind::StarEqual => "`*=`",
TokenKind::DoubleStarEqual => "`**=`",
TokenKind::SlashEqual => "`/=`",
TokenKind::DoubleSlashEqual => "`//=`",
TokenKind::PercentEqual => "`%=`",
TokenKind::VbarEqual => "`|=`",
TokenKind::AmperEqual => "`&=`",
TokenKind::CircumflexEqual => "`^=`",
TokenKind::LeftShiftEqual => "`<<=`",
TokenKind::RightShiftEqual => "`>>=`",
TokenKind::AtEqual => "`@=`",
TokenKind::Ellipsis => "`...`",
TokenKind::False => "`False`",
TokenKind::None => "`None`",
TokenKind::True => "`True`",
TokenKind::And => "`and`",
TokenKind::As => "`as`",
TokenKind::Assert => "`assert`",
TokenKind::Async => "`async`",
TokenKind::Await => "`await`",
TokenKind::Break => "`break`",
TokenKind::Class => "`class`",
TokenKind::Continue => "`continue`",
TokenKind::Def => "`def`",
TokenKind::Del => "`del`",
TokenKind::Elif => "`elif`",
TokenKind::Else => "`else`",
TokenKind::Except => "`except`",
TokenKind::Finally => "`finally`",
TokenKind::For => "`for`",
TokenKind::From => "`from`",
TokenKind::Global => "`global`",
TokenKind::If => "`if`",
TokenKind::Import => "`import`",
TokenKind::In => "`in`",
TokenKind::Is => "`is`",
TokenKind::Lambda => "`lambda`",
TokenKind::Nonlocal => "`nonlocal`",
TokenKind::Not => "`not`",
TokenKind::Or => "`or`",
TokenKind::Pass => "`pass`",
TokenKind::Raise => "`raise`",
TokenKind::Return => "`return`",
TokenKind::Try => "`try`",
TokenKind::While => "`while`",
TokenKind::Match => "`match`",
TokenKind::Type => "`type`",
TokenKind::Case => "`case`",
TokenKind::With => "`with`",
TokenKind::Yield => "`yield`",
};
f.write_str(value)
}
}
bitflags! {
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TokenFlags: u16 {
/// The token is a string with double quotes (`"`).
const DOUBLE_QUOTES = 1 << 0;
/// The token is a triple-quoted string i.e., it starts and ends with three consecutive
/// quote characters (`"""` or `'''`).
const TRIPLE_QUOTED_STRING = 1 << 1;
/// The token is a unicode string i.e., prefixed with `u` or `U`
const UNICODE_STRING = 1 << 2;
/// The token is a byte string i.e., prefixed with `b` or `B`
const BYTE_STRING = 1 << 3;
/// The token is an f-string i.e., prefixed with `f` or `F`
const F_STRING = 1 << 4;
/// The token is a t-string i.e., prefixed with `t` or `T`
const T_STRING = 1 << 5;
/// The token is a raw string and the prefix character is in lowercase.
const RAW_STRING_LOWERCASE = 1 << 6;
/// The token is a raw string and the prefix character is in uppercase.
const RAW_STRING_UPPERCASE = 1 << 7;
/// String without matching closing quote(s)
const UNCLOSED_STRING = 1 << 8;
/// The token is a raw string i.e., prefixed with `r` or `R`
const RAW_STRING = Self::RAW_STRING_LOWERCASE.bits() | Self::RAW_STRING_UPPERCASE.bits();
}
}
#[cfg(feature = "get-size")]
impl get_size2::GetSize for TokenFlags {}
impl StringFlags for TokenFlags {
fn quote_style(self) -> Quote {
if self.intersects(TokenFlags::DOUBLE_QUOTES) {
Quote::Double
} else {
Quote::Single
}
}
fn triple_quotes(self) -> TripleQuotes {
if self.intersects(TokenFlags::TRIPLE_QUOTED_STRING) {
TripleQuotes::Yes
} else {
TripleQuotes::No
}
}
fn prefix(self) -> AnyStringPrefix {
if self.intersects(TokenFlags::F_STRING) {
if self.intersects(TokenFlags::RAW_STRING_LOWERCASE) {
AnyStringPrefix::Format(FStringPrefix::Raw { uppercase_r: false })
} else if self.intersects(TokenFlags::RAW_STRING_UPPERCASE) {
AnyStringPrefix::Format(FStringPrefix::Raw { uppercase_r: true })
} else {
AnyStringPrefix::Format(FStringPrefix::Regular)
}
} else if self.intersects(TokenFlags::T_STRING) {
if self.intersects(TokenFlags::RAW_STRING_LOWERCASE) {
AnyStringPrefix::Template(TStringPrefix::Raw { uppercase_r: false })
} else if self.intersects(TokenFlags::RAW_STRING_UPPERCASE) {
AnyStringPrefix::Template(TStringPrefix::Raw { uppercase_r: true })
} else {
AnyStringPrefix::Template(TStringPrefix::Regular)
}
} else if self.intersects(TokenFlags::BYTE_STRING) {
if self.intersects(TokenFlags::RAW_STRING_LOWERCASE) {
AnyStringPrefix::Bytes(ByteStringPrefix::Raw { uppercase_r: false })
} else if self.intersects(TokenFlags::RAW_STRING_UPPERCASE) {
AnyStringPrefix::Bytes(ByteStringPrefix::Raw { uppercase_r: true })
} else {
AnyStringPrefix::Bytes(ByteStringPrefix::Regular)
}
} else if self.intersects(TokenFlags::RAW_STRING_LOWERCASE) {
AnyStringPrefix::Regular(StringLiteralPrefix::Raw { uppercase: false })
} else if self.intersects(TokenFlags::RAW_STRING_UPPERCASE) {
AnyStringPrefix::Regular(StringLiteralPrefix::Raw { uppercase: true })
} else if self.intersects(TokenFlags::UNICODE_STRING) {
AnyStringPrefix::Regular(StringLiteralPrefix::Unicode)
} else {
AnyStringPrefix::Regular(StringLiteralPrefix::Empty)
}
}
fn is_unclosed(self) -> bool {
self.intersects(TokenFlags::UNCLOSED_STRING)
}
}
impl TokenFlags {
/// Returns `true` if the token is an f-string.
pub const fn is_f_string(self) -> bool {
self.intersects(TokenFlags::F_STRING)
}
/// Returns `true` if the token is a t-string.
pub const fn is_t_string(self) -> bool {
self.intersects(TokenFlags::T_STRING)
}
/// Returns `true` if the token is an f-string or a t-string.
pub const fn is_interpolated_string(self) -> bool {
self.intersects(TokenFlags::T_STRING.union(TokenFlags::F_STRING))
}
/// Returns `true` if the token is a triple-quoted f-string or t-string.
pub fn is_triple_quoted_interpolated_string(self) -> bool {
self.intersects(TokenFlags::TRIPLE_QUOTED_STRING) && self.is_interpolated_string()
}
/// Returns `true` if the token is a raw string.
pub const fn is_raw_string(self) -> bool {
self.intersects(TokenFlags::RAW_STRING)
}
}
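
The `TokenFlags` helpers above test composite flags with `intersects`, which is true when any bit is shared, rather than requiring every bit to be set. That distinction matters for `RAW_STRING`, which is the union of the lowercase and uppercase raw bits. A dependency-free sketch of the same bit layout is shown below; the `Flags` type and its shortened constant names are hypothetical and only imitate what the `bitflags!` macro generates.

```rust
// Hypothetical hand-rolled stand-in for the `bitflags!`-generated `TokenFlags`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Flags(u16);

impl Flags {
    pub const F_STRING: Flags = Flags(1 << 4);
    pub const T_STRING: Flags = Flags(1 << 5);
    pub const RAW_LOWER: Flags = Flags(1 << 6);
    pub const RAW_UPPER: Flags = Flags(1 << 7);
    /// Composite flag: either raw prefix case.
    pub const RAW: Flags = Flags(Self::RAW_LOWER.0 | Self::RAW_UPPER.0);

    pub const fn union(self, other: Flags) -> Flags {
        Flags(self.0 | other.0)
    }

    /// True if at least one bit is set in both values.
    pub const fn intersects(self, other: Flags) -> bool {
        self.0 & other.0 != 0
    }

    /// True for f-strings and t-strings alike.
    pub const fn is_interpolated_string(self) -> bool {
        self.intersects(Flags::F_STRING.union(Flags::T_STRING))
    }

    /// True if either the lowercase or the uppercase raw bit is set.
    pub const fn is_raw_string(self) -> bool {
        self.intersects(Flags::RAW)
    }
}
```

A check that required all bits of `RAW` would never succeed for a real token, since a prefix is either `r` or `R` and only one of the two bits is set at a time.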

View File

@@ -0,0 +1,58 @@
use ruff_text_size::{Ranged, TextLen, TextRange};
use super::{TokenKind, Tokens};
use crate::{AnyNodeRef, ExprRef};
/// Returns an iterator over the ranges of the optional parentheses surrounding an expression.
///
/// E.g. for `((f()))` with `f()` as expression, the iterator returns the ranges (1, 6) and (0, 7).
///
/// Note that without a parent the range can be inaccurate: e.g. for `f(a)`, we falsely return a set of
/// parentheses around `a` even though the parentheses actually belong to `f`. That is why you should
/// generally prefer [`parenthesized_range`].
pub fn parentheses_iterator<'a>(
expr: ExprRef<'a>,
parent: Option<AnyNodeRef>,
tokens: &'a Tokens,
) -> impl Iterator<Item = TextRange> + 'a {
let after_tokens = if let Some(parent) = parent {
// If the parent is a node that brings its own parentheses, exclude the closing parenthesis
// from our search range. Otherwise, we risk matching on calls, like `func(x)`, for which
// the open and close parentheses are part of the `Arguments` node.
let exclusive_parent_end = if parent.is_arguments() {
parent.end() - ")".text_len()
} else {
parent.end()
};
tokens.in_range(TextRange::new(expr.end(), exclusive_parent_end))
} else {
tokens.after(expr.end())
};
let right_parens = after_tokens
.iter()
.filter(|token| !token.kind().is_trivia())
.take_while(move |token| token.kind() == TokenKind::Rpar);
let left_parens = tokens
.before(expr.start())
.iter()
.rev()
.filter(|token| !token.kind().is_trivia())
.take_while(|token| token.kind() == TokenKind::Lpar);
right_parens
.zip(left_parens)
.map(|(right, left)| TextRange::new(left.start(), right.end()))
}
/// Returns the [`TextRange`] of a given expression including parentheses, if the expression is
/// parenthesized; or `None`, if the expression is not parenthesized.
pub fn parenthesized_range(
expr: ExprRef,
parent: AnyNodeRef,
tokens: &Tokens,
) -> Option<TextRange> {
parentheses_iterator(expr, Some(parent), tokens).last()
}
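The pairing logic in `parentheses_iterator` can be sketched without the ruff crates. The version below is an illustration only: `Kind`, `Tok`, `paren_ranges`, and `example_tokens` are hypothetical names, `Comment` stands in for all trivia, and plain `u32` offsets replace `TextSize`/`TextRange`. It walks right from the expression end and left from the expression start, skips trivia, and zips the two runs of parentheses so that the innermost pair comes first.

```rust
/// Simplified token kinds; only what the sketch needs.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Kind {
    Lpar,
    Rpar,
    Comment, // stands in for every trivia token
    Other,
}

/// A token with a half-open byte range `start..end`.
#[derive(Clone, Copy, Debug)]
struct Tok {
    kind: Kind,
    start: u32,
    end: u32,
}

/// Pairs the closing parentheses that follow `expr_end` with the opening
/// parentheses that precede `expr_start`, innermost pair first, and returns
/// the `(start, end)` range of each pair.
fn paren_ranges(tokens: &[Tok], expr_start: u32, expr_end: u32) -> Vec<(u32, u32)> {
    let right = tokens
        .iter()
        .filter(|t| t.start >= expr_end)
        .filter(|t| t.kind != Kind::Comment)
        .take_while(|t| t.kind == Kind::Rpar);
    let left = tokens
        .iter()
        .rev()
        .filter(|t| t.end <= expr_start)
        .filter(|t| t.kind != Kind::Comment)
        .take_while(|t| t.kind == Kind::Lpar);
    right.zip(left).map(|(r, l)| (l.start, r.end)).collect()
}

/// The tokens of `((f()))`, one byte each; the expression `f()` spans `2..5`.
fn example_tokens() -> Vec<Tok> {
    let kinds = [
        Kind::Lpar,
        Kind::Lpar,
        Kind::Other,
        Kind::Lpar,
        Kind::Rpar,
        Kind::Rpar,
        Kind::Rpar,
    ];
    kinds
        .iter()
        .enumerate()
        .map(|(i, &kind)| Tok { kind, start: i as u32, end: i as u32 + 1 })
        .collect()
}
```

For `((f()))` this yields `(1, 6)` and then `(0, 7)`, matching the doc comment. Run on `f(a)` with `a` as the expression, the same sketch reports `(1, 4)`, which is the misattribution the doc comment warns about; the real function avoids it by trimming the closing parenthesis of an `Arguments` parent from the search range.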


@@ -0,0 +1,520 @@
use std::{iter::FusedIterator, ops::Deref};
use super::{Token, TokenKind};
use ruff_python_trivia::CommentRanges;
use ruff_text_size::{Ranged as _, TextRange, TextSize};
/// Tokens represents a vector of lexed [`Token`].
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(feature = "get-size", derive(get_size2::GetSize))]
pub struct Tokens {
raw: Vec<Token>,
}
impl Tokens {
pub fn new(tokens: Vec<Token>) -> Tokens {
Tokens { raw: tokens }
}
/// Returns an iterator over all the tokens that provides context.
pub fn iter_with_context(&self) -> TokenIterWithContext<'_> {
TokenIterWithContext::new(&self.raw)
}
/// Performs a binary search to find the index of the **first** token that starts at the given `offset`.
///
/// Unlike `binary_search_by_key`, this method ensures that if multiple tokens start at the same offset,
/// it returns the index of the first one. Multiple tokens can start at the same offset in cases where
/// zero-length tokens are involved (like `Dedent` or `Newline` at the end of the file).
pub fn binary_search_by_start(&self, offset: TextSize) -> Result<usize, usize> {
let partition_point = self.partition_point(|token| token.start() < offset);
let after = &self[partition_point..];
if after.first().is_some_and(|first| first.start() == offset) {
Ok(partition_point)
} else {
Err(partition_point)
}
}
/// Returns a slice of [`Token`] that are within the given `range`.
///
/// The start and end offset of the given range should be either:
/// 1. Token boundary
/// 2. Gap between the tokens
///
/// For example, considering the following tokens and their corresponding range:
///
/// | Token | Range |
/// |---------------------|-----------|
/// | `Def` | `0..3` |
/// | `Name` | `4..7` |
/// | `Lpar` | `7..8` |
/// | `Rpar` | `8..9` |
/// | `Colon` | `9..10` |
/// | `Newline` | `10..11` |
/// | `Comment` | `15..24` |
/// | `NonLogicalNewline` | `24..25` |
/// | `Indent` | `25..29` |
/// | `Pass` | `29..33` |
///
/// Here, for (1) a token boundary is considered either the start or end offset of any of the
/// above tokens. For (2), the gap would be any offset between the `Newline` and `Comment`
/// token which are 12, 13, and 14.
///
/// Examples:
/// 1) `4..10` would give `Name`, `Lpar`, `Rpar`, `Colon`
/// 2) `11..25` would give `Comment`, `NonLogicalNewline`
/// 3) `12..25` would give same as (2) and offset 12 is in the "gap"
/// 4) `9..12` would give `Colon`, `Newline` and offset 12 is in the "gap"
/// 5) `18..27` would panic because both the start and end offsets are within a token
///
/// ## Note
///
/// The returned slice can contain the [`TokenKind::Unknown`] token if there was a lexical
/// error encountered within the given range.
///
/// # Panics
///
/// If either the start or end offset of the given range is within a token range.
pub fn in_range(&self, range: TextRange) -> &[Token] {
let tokens_after_start = self.after(range.start());
Self::before_impl(tokens_after_start, range.end())
}
/// Searches the token(s) at `offset`.
///
/// Returns [`TokenAt::Between`] if `offset` points directly in between two tokens
/// (the left token ends at `offset` and the right token starts at `offset`).
pub fn at_offset(&self, offset: TextSize) -> TokenAt {
match self.binary_search_by_start(offset) {
// The token at `index` starts exactly at `offset`.
// ```python
// object.attribute
// ^ OFFSET
// ```
Ok(index) => {
let token = self[index];
// `token` starts exactly at `offset`. Test if the offset is right between
// `token` and the previous token (if there's any)
if let Some(previous) = index.checked_sub(1).map(|idx| self[idx]) {
if previous.end() == offset {
return TokenAt::Between(previous, token);
}
}
TokenAt::Single(token)
}
// No token found that starts exactly at the given offset. But it's possible that
// the token starting before `offset` fully encloses `offset` (its range ends after `offset`).
// ```python
// object.attribute
// ^ OFFSET
// # or
// if True:
// print("test")
// ^ OFFSET
// ```
Err(index) => {
if let Some(previous) = index.checked_sub(1).map(|idx| self[idx]) {
if previous.range().contains_inclusive(offset) {
return TokenAt::Single(previous);
}
}
TokenAt::None
}
}
}
/// Returns a slice of tokens before the given [`TextSize`] offset.
///
/// If the given offset is between two tokens, the returned slice will end just before the
/// following token. In other words, if the offset is between the end of previous token and
/// start of next token, the returned slice will end just before the next token.
///
/// # Panics
///
/// If the given offset is inside a token range at any point
/// other than the start of the range.
pub fn before(&self, offset: TextSize) -> &[Token] {
Self::before_impl(&self.raw, offset)
}
fn before_impl(tokens: &[Token], offset: TextSize) -> &[Token] {
let partition_point = tokens.partition_point(|token| token.start() < offset);
let before = &tokens[..partition_point];
if let Some(last) = before.last() {
// If it's equal to the end offset, then it's at a token boundary which is
// valid. If it's greater than the end offset, then it's in the gap between
// the tokens which is valid as well.
assert!(
offset >= last.end(),
"Offset {:?} is inside a token range {:?}",
offset,
last.range()
);
}
before
}
/// Returns a slice of tokens after the given [`TextSize`] offset.
///
/// If the given offset is between two tokens, the returned slice will start from the following
/// token. In other words, if the offset is between the end of previous token and start of next
/// token, the returned slice will start from the next token.
///
/// # Panics
///
/// If the given offset is inside a token range at any point
/// other than the start of the range.
pub fn after(&self, offset: TextSize) -> &[Token] {
let partition_point = self.partition_point(|token| token.end() <= offset);
let after = &self[partition_point..];
if let Some(first) = after.first() {
// If it's equal to the start offset, then it's at a token boundary which is
// valid. If it's less than the start offset, then it's in the gap between
// the tokens which is valid as well.
assert!(
offset <= first.start(),
"Offset {:?} is inside a token range {:?}",
offset,
first.range()
);
}
after
}
}
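The slicing rules of `before`, `after`, and `in_range` can be tried outside the crate with a small stand-in. This is a sketch under assumptions: `Tok` and `example` are hypothetical, offsets are plain `u32` values, and the free functions mirror the method bodies above rather than reproduce them exactly. The token ranges are the ones from the table in the `in_range` docs.

```rust
/// Hypothetical simplified token: just a half-open range `start..end`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Tok {
    start: u32,
    end: u32,
}

/// Tokens that start before `offset`; panics if `offset` lies inside the last of them.
fn before(tokens: &[Tok], offset: u32) -> &[Tok] {
    let split = tokens.partition_point(|t| t.start < offset);
    let before = &tokens[..split];
    if let Some(last) = before.last() {
        assert!(offset >= last.end, "offset {} is inside {:?}", offset, last);
    }
    before
}

/// Tokens that end after `offset`; panics if `offset` lies inside the first of them.
fn after(tokens: &[Tok], offset: u32) -> &[Tok] {
    let split = tokens.partition_point(|t| t.end <= offset);
    let after = &tokens[split..];
    if let Some(first) = after.first() {
        assert!(offset <= first.start, "offset {} is inside {:?}", offset, first);
    }
    after
}

/// Tokens fully inside `start..end`: trim the front first, then the back.
fn in_range(tokens: &[Tok], start: u32, end: u32) -> &[Tok] {
    before(after(tokens, start), end)
}

/// The ranges from the table in the `in_range` docs (gap at offsets 12 to 14).
fn example() -> Vec<Tok> {
    [
        (0, 3),
        (4, 7),
        (7, 8),
        (8, 9),
        (9, 10),
        (10, 11),
        (15, 24),
        (24, 25),
        (25, 29),
        (29, 33),
    ]
    .iter()
    .map(|&(start, end)| Tok { start, end })
    .collect()
}
```

With these ranges, `in_range(&example(), 4, 10)` returns four tokens and `in_range(&example(), 9, 12)` returns two, as in examples (1) and (4) of the doc comment. Both bounds rely on `partition_point`, so the token slice must be sorted by position.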
impl<'a> IntoIterator for &'a Tokens {
type Item = &'a Token;
type IntoIter = std::slice::Iter<'a, Token>;
fn into_iter(self) -> Self::IntoIter {
self.iter()
}
}
impl Deref for Tokens {
type Target = [Token];
fn deref(&self) -> &Self::Target {
&self.raw
}
}
/// A token that encloses a given offset or ends exactly at it.
#[derive(Debug, Clone)]
pub enum TokenAt {
/// There's no token at the given offset
None,
/// There's a single token at the given offset.
Single(Token),
/// The offset falls exactly between two tokens. E.g. `CURSOR` in `call<CURSOR>(arguments)` is
/// positioned exactly between the `call` and `(` tokens.
Between(Token, Token),
}
impl Iterator for TokenAt {
type Item = Token;
fn next(&mut self) -> Option<Self::Item> {
match *self {
TokenAt::None => None,
TokenAt::Single(token) => {
*self = TokenAt::None;
Some(token)
}
TokenAt::Between(first, second) => {
*self = TokenAt::Single(second);
Some(first)
}
}
}
}
impl FusedIterator for TokenAt {}
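How `at_offset` chooses between `None`, `Single`, and `Between` can be sketched as follows. `Tok` and `At` are hypothetical stand-ins for `Token` and `TokenAt`, offsets are plain `u32` values, and the inclusive containment check is written out by hand instead of calling `contains_inclusive`.

```rust
/// Hypothetical simplified token: just a half-open range `start..end`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Tok {
    start: u32,
    end: u32,
}

/// Stand-in for `TokenAt`.
#[derive(Debug, PartialEq, Eq)]
enum At {
    None,
    Single(Tok),
    Between(Tok, Tok),
}

fn at_offset(tokens: &[Tok], offset: u32) -> At {
    // Index of the first token whose start is not before `offset`.
    let index = tokens.partition_point(|t| t.start < offset);
    let previous = index.checked_sub(1).map(|i| tokens[i]);
    match tokens.get(index).copied().filter(|t| t.start == offset) {
        // A token starts exactly at `offset`; it may also touch the previous token.
        Some(token) => match previous {
            Some(prev) if prev.end == offset => At::Between(prev, token),
            _ => At::Single(token),
        },
        // No token starts here; the previous token may still enclose `offset`
        // or end exactly at it.
        None => match previous {
            Some(prev) if prev.start <= offset && offset <= prev.end => At::Single(prev),
            _ => At::None,
        },
    }
}
```

For the three tokens of `arg.call` (`0..3`, `3..4`, `4..8`), offset 3 gives `Between(arg, .)`, offset 1 gives `Single(arg)`, and offset 8 gives `Single(call)` because the containment check is inclusive at the end.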
impl From<&Tokens> for CommentRanges {
fn from(tokens: &Tokens) -> Self {
let mut ranges = vec![];
for token in tokens {
if token.kind() == TokenKind::Comment {
ranges.push(token.range());
}
}
CommentRanges::new(ranges)
}
}
/// An iterator over the [`Token`]s with context.
///
/// This struct is created by the [`iter_with_context`] method on [`Tokens`]. Refer to its
/// documentation for more details.
///
/// [`iter_with_context`]: Tokens::iter_with_context
#[derive(Debug, Clone)]
pub struct TokenIterWithContext<'a> {
inner: std::slice::Iter<'a, Token>,
nesting: u32,
}
impl<'a> TokenIterWithContext<'a> {
fn new(tokens: &'a [Token]) -> TokenIterWithContext<'a> {
TokenIterWithContext {
inner: tokens.iter(),
nesting: 0,
}
}
/// Return the nesting level the iterator is currently in.
pub const fn nesting(&self) -> u32 {
self.nesting
}
/// Returns `true` if the iterator is within a parenthesized context.
pub const fn in_parenthesized_context(&self) -> bool {
self.nesting > 0
}
/// Returns the next [`Token`] in the iterator without consuming it.
pub fn peek(&self) -> Option<&'a Token> {
self.clone().next()
}
}
impl<'a> Iterator for TokenIterWithContext<'a> {
type Item = &'a Token;
fn next(&mut self) -> Option<Self::Item> {
let token = self.inner.next()?;
match token.kind() {
TokenKind::Lpar | TokenKind::Lbrace | TokenKind::Lsqb => self.nesting += 1,
TokenKind::Rpar | TokenKind::Rbrace | TokenKind::Rsqb => {
self.nesting = self.nesting.saturating_sub(1);
}
// This mimics the behavior of re-lexing which reduces the nesting level on the lexer.
// We don't need to reduce it by 1 because unlike the lexer we see the final token
// after recovering from every unclosed parenthesis.
TokenKind::Newline if self.nesting > 0 => {
self.nesting = 0;
}
_ => {}
}
Some(token)
}
}
impl FusedIterator for TokenIterWithContext<'_> {}
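The nesting bookkeeping in `TokenIterWithContext::next` reduces to a small state machine. The sketch below is a simplified illustration: `Kind` and `nesting_levels` are hypothetical, and all bracket types are collapsed into `Open` and `Close`. It records the nesting level after each token, including the reset on a logical `Newline`, which, going by the comment above, only appears inside brackets when the lexer has recovered from an unclosed one.

```rust
/// Simplified token kinds: any opening bracket, any closing bracket,
/// a logical newline, and everything else.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Kind {
    Open,
    Close,
    Newline,
    Other,
}

/// Returns the nesting level observed after each token.
fn nesting_levels(kinds: &[Kind]) -> Vec<u32> {
    let mut nesting: u32 = 0;
    kinds
        .iter()
        .map(|kind| {
            match kind {
                Kind::Open => nesting += 1,
                // Saturating: a stray closing bracket never underflows.
                Kind::Close => nesting = nesting.saturating_sub(1),
                // A logical newline ends every open bracket context at once.
                Kind::Newline => nesting = 0,
                Kind::Other => {}
            }
            nesting
        })
        .collect()
}
```

A balanced input such as `Open, Other, Close` ends at level 0, an extra `Close` stays at 0 thanks to `saturating_sub`, and `Open, Open, Newline` drops straight from 2 to 0.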
#[cfg(test)]
mod tests {
use std::ops::Range;
use ruff_text_size::TextSize;
use crate::token::{Token, TokenFlags, TokenKind};
use super::*;
/// Test case containing a "gap" between two tokens.
///
/// Code: <https://play.ruff.rs/a3658340-6df8-42c5-be80-178744bf1193>
const TEST_CASE_WITH_GAP: [(TokenKind, Range<u32>); 10] = [
(TokenKind::Def, 0..3),
(TokenKind::Name, 4..7),
(TokenKind::Lpar, 7..8),
(TokenKind::Rpar, 8..9),
(TokenKind::Colon, 9..10),
(TokenKind::Newline, 10..11),
// Gap ||..||
(TokenKind::Comment, 15..24),
(TokenKind::NonLogicalNewline, 24..25),
(TokenKind::Indent, 25..29),
(TokenKind::Pass, 29..33),
// No newline at the end to keep the token set full of unique tokens
];
/// Helper function to create [`Tokens`] from an iterator of (kind, range).
fn new_tokens(tokens: impl Iterator<Item = (TokenKind, Range<u32>)>) -> Tokens {
Tokens::new(
tokens
.map(|(kind, range)| {
Token::new(
kind,
TextRange::new(TextSize::new(range.start), TextSize::new(range.end)),
TokenFlags::empty(),
)
})
.collect(),
)
}
#[test]
fn tokens_after_offset_at_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let after = tokens.after(TextSize::new(8));
assert_eq!(after.len(), 7);
assert_eq!(after.first().unwrap().kind(), TokenKind::Rpar);
}
#[test]
fn tokens_after_offset_at_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let after = tokens.after(TextSize::new(11));
assert_eq!(after.len(), 4);
assert_eq!(after.first().unwrap().kind(), TokenKind::Comment);
}
#[test]
fn tokens_after_offset_between_tokens() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let after = tokens.after(TextSize::new(13));
assert_eq!(after.len(), 4);
assert_eq!(after.first().unwrap().kind(), TokenKind::Comment);
}
#[test]
fn tokens_after_offset_at_last_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let after = tokens.after(TextSize::new(33));
assert_eq!(after.len(), 0);
}
#[test]
#[should_panic(expected = "Offset 5 is inside a token range 4..7")]
fn tokens_after_offset_inside_token() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
tokens.after(TextSize::new(5));
}
#[test]
fn tokens_before_offset_at_first_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(0));
assert_eq!(before.len(), 0);
}
#[test]
fn tokens_before_offset_after_first_token_gap() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(3));
assert_eq!(before.len(), 1);
assert_eq!(before.last().unwrap().kind(), TokenKind::Def);
}
#[test]
fn tokens_before_offset_at_second_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(4));
assert_eq!(before.len(), 1);
assert_eq!(before.last().unwrap().kind(), TokenKind::Def);
}
#[test]
fn tokens_before_offset_at_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(8));
assert_eq!(before.len(), 3);
assert_eq!(before.last().unwrap().kind(), TokenKind::Lpar);
}
#[test]
fn tokens_before_offset_at_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(11));
assert_eq!(before.len(), 6);
assert_eq!(before.last().unwrap().kind(), TokenKind::Newline);
}
#[test]
fn tokens_before_offset_between_tokens() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(13));
assert_eq!(before.len(), 6);
assert_eq!(before.last().unwrap().kind(), TokenKind::Newline);
}
#[test]
fn tokens_before_offset_at_last_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(33));
assert_eq!(before.len(), 10);
assert_eq!(before.last().unwrap().kind(), TokenKind::Pass);
}
#[test]
#[should_panic(expected = "Offset 5 is inside a token range 4..7")]
fn tokens_before_offset_inside_token() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
tokens.before(TextSize::new(5));
}
#[test]
fn tokens_in_range_at_token_offset() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(4.into(), 10.into()));
assert_eq!(in_range.len(), 4);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Name);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Colon);
}
#[test]
fn tokens_in_range_start_offset_at_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(11.into(), 29.into()));
assert_eq!(in_range.len(), 3);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Comment);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Indent);
}
#[test]
fn tokens_in_range_end_offset_at_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(8.into(), 15.into()));
assert_eq!(in_range.len(), 3);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Rpar);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Newline);
}
#[test]
fn tokens_in_range_start_offset_between_tokens() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(13.into(), 29.into()));
assert_eq!(in_range.len(), 3);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Comment);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Indent);
}
#[test]
fn tokens_in_range_end_offset_between_tokens() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(9.into(), 13.into()));
assert_eq!(in_range.len(), 2);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Colon);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Newline);
}
#[test]
#[should_panic(expected = "Offset 5 is inside a token range 4..7")]
fn tokens_in_range_start_offset_inside_token() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
tokens.in_range(TextRange::new(5.into(), 10.into()));
}
#[test]
#[should_panic(expected = "Offset 6 is inside a token range 4..7")]
fn tokens_in_range_end_offset_inside_token() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
tokens.in_range(TextRange::new(0.into(), 6.into()));
}
}


@@ -0,0 +1,199 @@
//! Tests for [`ruff_python_ast::token::parentheses_iterator`] and
//! [`ruff_python_ast::token::parenthesized_range`].
use ruff_python_ast::{
self as ast, Expr,
token::{parentheses_iterator, parenthesized_range},
};
use ruff_python_parser::parse_module;
#[test]
fn test_no_parentheses() {
let source = "x = 2 + 2";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Assign(assign) = stmt else {
panic!("expected `Assign` statement, got {stmt:?}");
};
let result = parenthesized_range(assign.value.as_ref().into(), stmt.into(), tokens);
assert_eq!(result, None);
}
#[test]
fn test_single_parentheses() {
let source = "x = (2 + 2)";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Assign(assign) = stmt else {
panic!("expected `Assign` statement, got {stmt:?}");
};
let result = parenthesized_range(assign.value.as_ref().into(), stmt.into(), tokens);
let range = result.expect("should find parentheses");
assert_eq!(&source[range], "(2 + 2)");
}
#[test]
fn test_double_parentheses() {
let source = "x = ((2 + 2))";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Assign(assign) = stmt else {
panic!("expected `Assign` statement, got {stmt:?}");
};
let result = parenthesized_range(assign.value.as_ref().into(), stmt.into(), tokens);
let range = result.expect("should find parentheses");
assert_eq!(&source[range], "((2 + 2))");
}
#[test]
fn test_parentheses_with_whitespace() {
let source = "x = ( 2 + 2 )";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Assign(assign) = stmt else {
panic!("expected `Assign` statement, got {stmt:?}");
};
let result = parenthesized_range(assign.value.as_ref().into(), stmt.into(), tokens);
let range = result.expect("should find parentheses");
assert_eq!(&source[range], "( 2 + 2 )");
}
#[test]
fn test_parentheses_with_comments() {
let source = "x = ( # comment\n 2 + 2\n)";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Assign(assign) = stmt else {
panic!("expected `Assign` statement, got {stmt:?}");
};
let result = parenthesized_range(assign.value.as_ref().into(), stmt.into(), tokens);
let range = result.expect("should find parentheses");
assert_eq!(&source[range], "( # comment\n 2 + 2\n)");
}
#[test]
fn test_parenthesized_range_multiple() {
let source = "x = (((2 + 2)))";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Assign(assign) = stmt else {
panic!("expected `Assign` statement, got {stmt:?}");
};
let result = parenthesized_range(assign.value.as_ref().into(), stmt.into(), tokens);
let range = result.expect("should find parentheses");
assert_eq!(&source[range], "(((2 + 2)))");
}
#[test]
fn test_parentheses_iterator_multiple() {
let source = "x = (((2 + 2)))";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Assign(assign) = stmt else {
panic!("expected `Assign` statement, got {stmt:?}");
};
let ranges: Vec<_> =
parentheses_iterator(assign.value.as_ref().into(), Some(stmt.into()), tokens).collect();
assert_eq!(ranges.len(), 3);
assert_eq!(&source[ranges[0]], "(2 + 2)");
assert_eq!(&source[ranges[1]], "((2 + 2))");
assert_eq!(&source[ranges[2]], "(((2 + 2)))");
}
#[test]
fn test_call_arguments_not_counted() {
let source = "f(x)";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Expr(expr_stmt) = stmt else {
panic!("expected `Expr` statement, got {stmt:?}");
};
let Expr::Call(call) = expr_stmt.value.as_ref() else {
panic!("expected `Call` expression, got {:?}", expr_stmt.value);
};
let arg = call
.arguments
.args
.first()
.expect("call should have an argument");
let result = parenthesized_range(arg.into(), (&call.arguments).into(), tokens);
// The parentheses belong to the call, not the argument
assert_eq!(result, None);
}
#[test]
fn test_call_with_parenthesized_argument() {
let source = "f((x))";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Expr(expr_stmt) = stmt else {
panic!("expected `Expr` statement, got {stmt:?}");
};
let Expr::Call(call) = expr_stmt.value.as_ref() else {
panic!("expected `Call` expression, got {:?}", expr_stmt.value);
};
let arg = call
.arguments
.args
.first()
.expect("call should have an argument");
let result = parenthesized_range(arg.into(), (&call.arguments).into(), tokens);
let range = result.expect("should find parentheses around argument");
assert_eq!(&source[range], "(x)");
}
#[test]
fn test_multiline_with_parentheses() {
let source = "x = (\n 2 + 2 + 2\n)";
let parsed = parse_module(source).expect("should parse valid python");
let tokens = parsed.tokens();
let module = parsed.syntax();
let stmt = module.body.first().expect("module should have a statement");
let ast::Stmt::Assign(assign) = stmt else {
panic!("expected `Assign` statement, got {stmt:?}");
};
let result = parenthesized_range(assign.value.as_ref().into(), stmt.into(), tokens);
let range = result.expect("should find parentheses");
assert_eq!(&source[range], "(\n 2 + 2 + 2\n)");
}


@@ -5,7 +5,7 @@ use std::cell::OnceCell;
use std::ops::Deref;
use ruff_python_ast::str::Quote;
use ruff_python_parser::{Token, TokenKind, Tokens};
use ruff_python_ast::token::{Token, TokenKind, Tokens};
use ruff_source_file::{LineEnding, LineRanges, find_newline};
use ruff_text_size::Ranged;


@@ -3,7 +3,7 @@ use std::ops::{Deref, DerefMut};
use ruff_formatter::{Buffer, FormatContext, GroupId, IndentWidth, SourceCode};
use ruff_python_ast::str::Quote;
use ruff_python_parser::Tokens;
use ruff_python_ast::token::Tokens;
use crate::PyFormatOptions;
use crate::comments::Comments;


@@ -5,7 +5,7 @@ use std::slice::Iter;
use ruff_formatter::{FormatError, write};
use ruff_python_ast::AnyNodeRef;
use ruff_python_ast::Stmt;
use ruff_python_parser::{self as parser, TokenKind};
use ruff_python_ast::token::{Token as AstToken, TokenKind};
use ruff_python_trivia::lines_before;
use ruff_source_file::LineRanges;
use ruff_text_size::{Ranged, TextRange, TextSize};
@@ -770,7 +770,7 @@ impl Format<PyFormatContext<'_>> for FormatVerbatimStatementRange {
}
struct LogicalLinesIter<'a> {
tokens: Iter<'a, parser::Token>,
tokens: Iter<'a, AstToken>,
// The end of the last logical line
last_line_end: TextSize,
// The position where the content to lex ends.
@@ -778,7 +778,7 @@ struct LogicalLinesIter<'a> {
}
impl<'a> LogicalLinesIter<'a> {
fn new(tokens: Iter<'a, parser::Token>, verbatim_range: TextRange) -> Self {
fn new(tokens: Iter<'a, AstToken>, verbatim_range: TextRange) -> Self {
Self {
tokens,
last_line_end: verbatim_range.start(),


@@ -14,7 +14,6 @@ license = { workspace = true }
ruff_diagnostics = { workspace = true }
ruff_python_ast = { workspace = true }
ruff_python_codegen = { workspace = true }
ruff_python_parser = { workspace = true }
ruff_python_trivia = { workspace = true }
ruff_source_file = { workspace = true, features = ["serde"] }
ruff_text_size = { workspace = true }
@@ -22,6 +21,8 @@ ruff_text_size = { workspace = true }
anyhow = { workspace = true }
[dev-dependencies]
ruff_python_parser = { workspace = true }
insta = { workspace = true }
[features]


@@ -5,8 +5,8 @@ use std::ops::Add;
use ruff_diagnostics::Edit;
use ruff_python_ast::Stmt;
use ruff_python_ast::helpers::is_docstring_stmt;
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_codegen::Stylist;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_trivia::is_python_whitespace;
use ruff_python_trivia::{PythonWhitespace, textwrap::indent};
use ruff_source_file::{LineRanges, UniversalNewlineIterator};
@@ -194,7 +194,7 @@ impl<'a> Insertion<'a> {
tokens
.before(at)
.last()
.map(ruff_python_parser::Token::kind),
.map(ruff_python_ast::token::Token::kind),
Some(TokenKind::Import)
) {
return None;


@@ -15,12 +15,12 @@ doctest = false
[dependencies]
ruff_python_ast = { workspace = true }
ruff_python_parser = { workspace = true }
ruff_python_trivia = { workspace = true }
ruff_source_file = { workspace = true }
ruff_text_size = { workspace = true }
[dev-dependencies]
ruff_python_parser = { workspace = true }
[lints]
workspace = true


@@ -2,7 +2,7 @@
//! are omitted from the AST (e.g., commented lines).
use ruff_python_ast::Stmt;
use ruff_python_parser::{TokenKind, Tokens};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_trivia::{
CommentRanges, has_leading_content, has_trailing_content, is_python_whitespace,
};


@@ -1,6 +1,6 @@
use std::collections::BTreeMap;
use ruff_python_parser::{Token, TokenKind};
use ruff_python_ast::token::{Token, TokenKind};
use ruff_text_size::{Ranged, TextRange, TextSize};
/// Stores the ranges of all interpolated strings in a file sorted by [`TextRange::start`].


@@ -1,4 +1,4 @@
use ruff_python_parser::{Token, TokenKind};
use ruff_python_ast::token::{Token, TokenKind};
use ruff_text_size::{Ranged, TextRange};
/// Stores the range of all multiline strings in a file sorted by


@@ -0,0 +1,3 @@
class C[T = int, U]: ...
class C[T1, T2 = int, T3, T4]: ...
type Alias[T = int, U] = ...


@@ -1,9 +1,10 @@
use std::fmt::{self, Display};
use ruff_python_ast::PythonVersion;
use ruff_python_ast::token::TokenKind;
use ruff_text_size::{Ranged, TextRange};
use crate::{TokenKind, string::InterpolatedStringKind};
use crate::string::InterpolatedStringKind;
/// Represents errors that occur during parsing and are
/// returned by the `parse_*` functions.


@@ -14,6 +14,7 @@ use unicode_normalization::UnicodeNormalization;
use ruff_python_ast::name::Name;
use ruff_python_ast::str_prefix::{AnyStringPrefix, StringLiteralPrefix};
use ruff_python_ast::token::{TokenFlags, TokenKind};
use ruff_python_ast::{Int, IpyEscapeKind, StringFlags};
use ruff_python_trivia::is_python_whitespace;
use ruff_text_size::{TextLen, TextRange, TextSize};
@@ -26,7 +27,7 @@ use crate::lexer::interpolated_string::{
InterpolatedStringContext, InterpolatedStrings, InterpolatedStringsCheckpoint,
};
use crate::string::InterpolatedStringKind;
use crate::token::{TokenFlags, TokenKind, TokenValue};
use crate::token::TokenValue;
mod cursor;
mod indentation;


@@ -63,23 +63,20 @@
//! [lexical analysis]: https://en.wikipedia.org/wiki/Lexical_analysis
//! [parsing]: https://en.wikipedia.org/wiki/Parsing
//! [lexer]: crate::lexer
use std::iter::FusedIterator;
use std::ops::Deref;
pub use crate::error::{
InterpolatedStringErrorType, LexicalErrorType, ParseError, ParseErrorType,
UnsupportedSyntaxError, UnsupportedSyntaxErrorKind,
};
pub use crate::parser::ParseOptions;
pub use crate::token::{Token, TokenKind};
use crate::parser::Parser;
use ruff_python_ast::token::Tokens;
use ruff_python_ast::{
Expr, Mod, ModExpression, ModModule, PySourceType, StringFlags, StringLiteral, Suite,
};
use ruff_python_trivia::CommentRanges;
use ruff_text_size::{Ranged, TextRange, TextSize};
use ruff_text_size::{Ranged, TextRange};
mod error;
pub mod lexer;
@@ -473,351 +470,6 @@ impl Parsed<ModExpression> {
}
}
/// Tokens represents a vector of lexed [`Token`].
#[derive(Debug, Clone, PartialEq, Eq, get_size2::GetSize)]
pub struct Tokens {
raw: Vec<Token>,
}
impl Tokens {
pub(crate) fn new(tokens: Vec<Token>) -> Tokens {
Tokens { raw: tokens }
}
/// Returns an iterator over all the tokens that provides context.
pub fn iter_with_context(&self) -> TokenIterWithContext<'_> {
TokenIterWithContext::new(&self.raw)
}
/// Performs a binary search to find the index of the **first** token that starts at the given `offset`.
///
/// Unlike `binary_search_by_key`, this method ensures that if multiple tokens start at the same offset,
/// it returns the index of the first one. Multiple tokens can start at the same offset in cases where
/// zero-length tokens are involved (like `Dedent` or `Newline` at the end of the file).
pub fn binary_search_by_start(&self, offset: TextSize) -> Result<usize, usize> {
let partition_point = self.partition_point(|token| token.start() < offset);
let after = &self[partition_point..];
if after.first().is_some_and(|first| first.start() == offset) {
Ok(partition_point)
} else {
Err(partition_point)
}
}
/// Returns a slice of [`Token`] that are within the given `range`.
///
/// The start and end offset of the given range should be either:
/// 1. Token boundary
/// 2. Gap between the tokens
///
/// For example, considering the following tokens and their corresponding range:
///
/// | Token | Range |
/// |---------------------|-----------|
/// | `Def` | `0..3` |
/// | `Name` | `4..7` |
/// | `Lpar` | `7..8` |
/// | `Rpar` | `8..9` |
/// | `Colon` | `9..10` |
/// | `Newline` | `10..11` |
/// | `Comment` | `15..24` |
/// | `NonLogicalNewline` | `24..25` |
/// | `Indent` | `25..29` |
/// | `Pass` | `29..33` |
///
/// Here, for (1) a token boundary is considered either the start or end offset of any of the
/// above tokens. For (2), the gap would be any offset between the `Newline` and `Comment`
/// token which are 12, 13, and 14.
///
/// Examples:
/// 1) `4..10` would give `Name`, `Lpar`, `Rpar`, `Colon`
/// 2) `11..25` would give `Comment`, `NonLogicalNewline`
/// 3) `12..25` would give same as (2) and offset 12 is in the "gap"
/// 4) `9..12` would give `Colon`, `Newline` and offset 12 is in the "gap"
/// 5) `18..27` would panic because both the start and end offsets are within a token
///
/// ## Note
///
/// The returned slice can contain the [`TokenKind::Unknown`] token if there was a lexical
/// error encountered within the given range.
///
/// # Panics
///
/// If either the start or end offset of the given range is within a token range.
pub fn in_range(&self, range: TextRange) -> &[Token] {
let tokens_after_start = self.after(range.start());
Self::before_impl(tokens_after_start, range.end())
}
/// Searches the token(s) at `offset`.
///
/// Returns [`TokenAt::Between`] if `offset` points directly in between two tokens
/// (the left token ends at `offset` and the right token starts at `offset`).
///
/// ## Examples
///
/// [Playground](https://play.ruff.rs/f3ad0a55-5931-4a13-96c7-b2b8bfdc9a2e?secondary=Tokens)
///
/// ```
/// # use ruff_python_ast::PySourceType;
/// # use ruff_python_parser::{Token, TokenAt, TokenKind};
/// # use ruff_text_size::{Ranged, TextSize};
///
/// let source = r#"
/// def test(arg):
/// arg.call()
/// if True:
/// pass
/// print("true")
/// "#.trim();
///
/// let parsed = ruff_python_parser::parse_unchecked_source(source, PySourceType::Python);
/// let tokens = parsed.tokens();
///
/// let collect_tokens = |offset: TextSize| {
/// tokens.at_offset(offset).into_iter().map(|t| (t.kind(), &source[t.range()])).collect::<Vec<_>>()
/// };
///
/// assert_eq!(collect_tokens(TextSize::new(4)), vec! [(TokenKind::Name, "test")]);
/// assert_eq!(collect_tokens(TextSize::new(6)), vec! [(TokenKind::Name, "test")]);
/// // between `arg` and `.`
/// assert_eq!(collect_tokens(TextSize::new(22)), vec! [(TokenKind::Name, "arg"), (TokenKind::Dot, ".")]);
/// assert_eq!(collect_tokens(TextSize::new(36)), vec! [(TokenKind::If, "if")]);
/// // Before the dedent token
/// assert_eq!(collect_tokens(TextSize::new(57)), vec! []);
/// ```
pub fn at_offset(&self, offset: TextSize) -> TokenAt {
match self.binary_search_by_start(offset) {
// The token at `index` starts exactly at `offset`.
// ```python
// object.attribute
// ^ OFFSET
// ```
Ok(index) => {
let token = self[index];
// `token` starts exactly at `offset`. Test if the offset is right between
// `token` and the previous token (if there's any)
if let Some(previous) = index.checked_sub(1).map(|idx| self[idx]) {
if previous.end() == offset {
return TokenAt::Between(previous, token);
}
}
TokenAt::Single(token)
}
// No token found that starts exactly at the given offset. But it's possible that
// the token starting before `offset` fully encloses `offset` (its range ends at or after `offset`).
// ```python
// object.attribute
// ^ OFFSET
// # or
// if True:
// print("test")
// ^ OFFSET
// ```
Err(index) => {
if let Some(previous) = index.checked_sub(1).map(|idx| self[idx]) {
if previous.range().contains_inclusive(offset) {
return TokenAt::Single(previous);
}
}
TokenAt::None
}
}
}
/// Returns a slice of tokens before the given [`TextSize`] offset.
///
/// If the given offset is between two tokens, the returned slice will end just before the
/// following token. In other words, if the offset is between the end of previous token and
/// start of next token, the returned slice will end just before the next token.
///
/// # Panics
///
/// If the given offset is inside a token range at any point
/// other than the start of the range.
pub fn before(&self, offset: TextSize) -> &[Token] {
Self::before_impl(&self.raw, offset)
}
fn before_impl(tokens: &[Token], offset: TextSize) -> &[Token] {
let partition_point = tokens.partition_point(|token| token.start() < offset);
let before = &tokens[..partition_point];
if let Some(last) = before.last() {
// If it's equal to the end offset, then it's at a token boundary which is
// valid. If it's greater than the end offset, then it's in the gap between
// the tokens which is valid as well.
assert!(
offset >= last.end(),
"Offset {:?} is inside a token range {:?}",
offset,
last.range()
);
}
before
}
/// Returns a slice of tokens after the given [`TextSize`] offset.
///
/// If the given offset is between two tokens, the returned slice will start from the following
/// token. In other words, if the offset is between the end of previous token and start of next
/// token, the returned slice will start from the next token.
///
/// # Panics
///
/// If the given offset is inside a token range at any point
/// other than the start of the range.
pub fn after(&self, offset: TextSize) -> &[Token] {
let partition_point = self.partition_point(|token| token.end() <= offset);
let after = &self[partition_point..];
if let Some(first) = after.first() {
// If it's equal to the start offset, then it's at a token boundary which is
// valid. If it's less than the start offset, then it's in the gap between
// the tokens which is valid as well.
assert!(
offset <= first.start(),
"Offset {:?} is inside a token range {:?}",
offset,
first.range()
);
}
after
}
}
impl<'a> IntoIterator for &'a Tokens {
type Item = &'a Token;
type IntoIter = std::slice::Iter<'a, Token>;
fn into_iter(self) -> Self::IntoIter {
self.iter()
}
}
impl Deref for Tokens {
type Target = [Token];
fn deref(&self) -> &Self::Target {
&self.raw
}
}
/// A token that encloses a given offset or ends exactly at it.
#[derive(Debug, Clone)]
pub enum TokenAt {
/// There's no token at the given offset
None,
/// There's a single token at the given offset.
Single(Token),
/// The offset falls exactly between two tokens. E.g. `CURSOR` in `call<CURSOR>(arguments)` is
/// positioned exactly between the `call` and `(` tokens.
Between(Token, Token),
}
impl Iterator for TokenAt {
type Item = Token;
fn next(&mut self) -> Option<Self::Item> {
match *self {
TokenAt::None => None,
TokenAt::Single(token) => {
*self = TokenAt::None;
Some(token)
}
TokenAt::Between(first, second) => {
*self = TokenAt::Single(second);
Some(first)
}
}
}
}
impl FusedIterator for TokenAt {}
impl From<&Tokens> for CommentRanges {
fn from(tokens: &Tokens) -> Self {
let mut ranges = vec![];
for token in tokens {
if token.kind() == TokenKind::Comment {
ranges.push(token.range());
}
}
CommentRanges::new(ranges)
}
}
/// An iterator over the [`Token`]s with context.
///
/// This struct is created by the [`iter_with_context`] method on [`Tokens`]. Refer to its
/// documentation for more details.
///
/// [`iter_with_context`]: Tokens::iter_with_context
#[derive(Debug, Clone)]
pub struct TokenIterWithContext<'a> {
inner: std::slice::Iter<'a, Token>,
nesting: u32,
}
impl<'a> TokenIterWithContext<'a> {
fn new(tokens: &'a [Token]) -> TokenIterWithContext<'a> {
TokenIterWithContext {
inner: tokens.iter(),
nesting: 0,
}
}
/// Return the nesting level the iterator is currently in.
pub const fn nesting(&self) -> u32 {
self.nesting
}
/// Returns `true` if the iterator is within a parenthesized context.
pub const fn in_parenthesized_context(&self) -> bool {
self.nesting > 0
}
/// Returns the next [`Token`] in the iterator without consuming it.
pub fn peek(&self) -> Option<&'a Token> {
self.clone().next()
}
}
impl<'a> Iterator for TokenIterWithContext<'a> {
type Item = &'a Token;
fn next(&mut self) -> Option<Self::Item> {
let token = self.inner.next()?;
match token.kind() {
TokenKind::Lpar | TokenKind::Lbrace | TokenKind::Lsqb => self.nesting += 1,
TokenKind::Rpar | TokenKind::Rbrace | TokenKind::Rsqb => {
self.nesting = self.nesting.saturating_sub(1);
}
// This mimics the behavior of re-lexing which reduces the nesting level on the lexer.
// We don't need to reduce it by 1 because unlike the lexer we see the final token
// after recovering from every unclosed parenthesis.
TokenKind::Newline if self.nesting > 0 => {
self.nesting = 0;
}
_ => {}
}
Some(token)
}
}
impl FusedIterator for TokenIterWithContext<'_> {}
/// Controls the mode in which a source file is parsed.
///
/// The mode argument specifies in what way code must be parsed.
@@ -888,204 +540,3 @@ impl std::fmt::Display for ModeParseError {
write!(f, r#"mode must be "exec", "eval", "ipython", or "single""#)
}
}
#[cfg(test)]
mod tests {
use std::ops::Range;
use crate::token::TokenFlags;
use super::*;
/// Test case containing a "gap" between two tokens.
///
/// Code: <https://play.ruff.rs/a3658340-6df8-42c5-be80-178744bf1193>
const TEST_CASE_WITH_GAP: [(TokenKind, Range<u32>); 10] = [
(TokenKind::Def, 0..3),
(TokenKind::Name, 4..7),
(TokenKind::Lpar, 7..8),
(TokenKind::Rpar, 8..9),
(TokenKind::Colon, 9..10),
(TokenKind::Newline, 10..11),
// Gap ||..||
(TokenKind::Comment, 15..24),
(TokenKind::NonLogicalNewline, 24..25),
(TokenKind::Indent, 25..29),
(TokenKind::Pass, 29..33),
// No newline at the end to keep the token set full of unique tokens
];
/// Helper function to create [`Tokens`] from an iterator of (kind, range).
fn new_tokens(tokens: impl Iterator<Item = (TokenKind, Range<u32>)>) -> Tokens {
Tokens::new(
tokens
.map(|(kind, range)| {
Token::new(
kind,
TextRange::new(TextSize::new(range.start), TextSize::new(range.end)),
TokenFlags::empty(),
)
})
.collect(),
)
}
#[test]
fn tokens_after_offset_at_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let after = tokens.after(TextSize::new(8));
assert_eq!(after.len(), 7);
assert_eq!(after.first().unwrap().kind(), TokenKind::Rpar);
}
#[test]
fn tokens_after_offset_at_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let after = tokens.after(TextSize::new(11));
assert_eq!(after.len(), 4);
assert_eq!(after.first().unwrap().kind(), TokenKind::Comment);
}
#[test]
fn tokens_after_offset_between_tokens() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let after = tokens.after(TextSize::new(13));
assert_eq!(after.len(), 4);
assert_eq!(after.first().unwrap().kind(), TokenKind::Comment);
}
#[test]
fn tokens_after_offset_at_last_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let after = tokens.after(TextSize::new(33));
assert_eq!(after.len(), 0);
}
#[test]
#[should_panic(expected = "Offset 5 is inside a token range 4..7")]
fn tokens_after_offset_inside_token() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
tokens.after(TextSize::new(5));
}
#[test]
fn tokens_before_offset_at_first_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(0));
assert_eq!(before.len(), 0);
}
#[test]
fn tokens_before_offset_after_first_token_gap() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(3));
assert_eq!(before.len(), 1);
assert_eq!(before.last().unwrap().kind(), TokenKind::Def);
}
#[test]
fn tokens_before_offset_at_second_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(4));
assert_eq!(before.len(), 1);
assert_eq!(before.last().unwrap().kind(), TokenKind::Def);
}
#[test]
fn tokens_before_offset_at_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(8));
assert_eq!(before.len(), 3);
assert_eq!(before.last().unwrap().kind(), TokenKind::Lpar);
}
#[test]
fn tokens_before_offset_at_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(11));
assert_eq!(before.len(), 6);
assert_eq!(before.last().unwrap().kind(), TokenKind::Newline);
}
#[test]
fn tokens_before_offset_between_tokens() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(13));
assert_eq!(before.len(), 6);
assert_eq!(before.last().unwrap().kind(), TokenKind::Newline);
}
#[test]
fn tokens_before_offset_at_last_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let before = tokens.before(TextSize::new(33));
assert_eq!(before.len(), 10);
assert_eq!(before.last().unwrap().kind(), TokenKind::Pass);
}
#[test]
#[should_panic(expected = "Offset 5 is inside a token range 4..7")]
fn tokens_before_offset_inside_token() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
tokens.before(TextSize::new(5));
}
#[test]
fn tokens_in_range_at_token_offset() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(4.into(), 10.into()));
assert_eq!(in_range.len(), 4);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Name);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Colon);
}
#[test]
fn tokens_in_range_start_offset_at_token_end() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(11.into(), 29.into()));
assert_eq!(in_range.len(), 3);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Comment);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Indent);
}
#[test]
fn tokens_in_range_end_offset_at_token_start() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(8.into(), 15.into()));
assert_eq!(in_range.len(), 3);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Rpar);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Newline);
}
#[test]
fn tokens_in_range_start_offset_between_tokens() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(13.into(), 29.into()));
assert_eq!(in_range.len(), 3);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Comment);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Indent);
}
#[test]
fn tokens_in_range_end_offset_between_tokens() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
let in_range = tokens.in_range(TextRange::new(9.into(), 13.into()));
assert_eq!(in_range.len(), 2);
assert_eq!(in_range.first().unwrap().kind(), TokenKind::Colon);
assert_eq!(in_range.last().unwrap().kind(), TokenKind::Newline);
}
#[test]
#[should_panic(expected = "Offset 5 is inside a token range 4..7")]
fn tokens_in_range_start_offset_inside_token() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
tokens.in_range(TextRange::new(5.into(), 10.into()));
}
#[test]
#[should_panic(expected = "Offset 6 is inside a token range 4..7")]
fn tokens_in_range_end_offset_inside_token() {
let tokens = new_tokens(TEST_CASE_WITH_GAP.into_iter());
tokens.in_range(TextRange::new(0.into(), 6.into()));
}
}
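
The boundary rules exercised by the tests above (an offset is valid at a token boundary or in a gap, and panics inside a token) can be reproduced in isolation. The following is a minimal sketch, not the crate's implementation: it assumes tokens are plain half-open `Range<u32>` values sorted by position, and the `Span` alias and the free functions `before`, `after`, and `in_range` are hypothetical stand-ins for the methods on `Tokens`.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a token: only its half-open byte range.
type Span = Range<u32>;

/// Returns the tokens that start before `offset`.
///
/// Panics if `offset` lies strictly inside the last of those tokens.
fn before(tokens: &[Span], offset: u32) -> &[Span] {
    // Tokens are sorted, so `start < offset` holds for a prefix of the slice.
    let idx = tokens.partition_point(|token| token.start < offset);
    let before = &tokens[..idx];
    if let Some(last) = before.last() {
        // `offset == last.end` is a token boundary; `offset > last.end` is a gap.
        assert!(
            offset >= last.end,
            "Offset {} is inside a token range {:?}",
            offset,
            last
        );
    }
    before
}

/// Returns the tokens that end after `offset`.
///
/// Panics if `offset` lies strictly inside the first of those tokens.
fn after(tokens: &[Span], offset: u32) -> &[Span] {
    // `end <= offset` holds for a prefix of the sorted slice.
    let idx = tokens.partition_point(|token| token.end <= offset);
    let after = &tokens[idx..];
    if let Some(first) = after.first() {
        // `offset == first.start` is a token boundary; `offset < first.start` is a gap.
        assert!(
            offset <= first.start,
            "Offset {} is inside a token range {:?}",
            offset,
            first
        );
    }
    after
}

/// Returns the tokens fully contained in `range` by composing `after` and `before`.
fn in_range(tokens: &[Span], range: Range<u32>) -> &[Span] {
    before(after(tokens, range.start), range.end)
}
```

With the ten ranges from `TEST_CASE_WITH_GAP`, `in_range(&tokens, 4..10)` yields the four ranges `4..7`, `7..8`, `8..9`, and `9..10`, while `in_range(&tokens, 5..10)` panics because offset 5 falls inside `4..7`.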

View File

@@ -4,6 +4,7 @@ use bitflags::bitflags;
use rustc_hash::{FxBuildHasher, FxHashSet};
use ruff_python_ast::name::Name;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{
self as ast, AnyStringFlags, AtomicNodeIndex, BoolOp, CmpOp, ConversionFlag, Expr, ExprContext,
FString, InterpolatedStringElement, InterpolatedStringElements, IpyEscapeKind, Number,
@@ -18,7 +19,7 @@ use crate::string::{
InterpolatedStringKind, StringType, parse_interpolated_string_literal_element,
parse_string_literal,
};
use crate::token::{TokenKind, TokenValue};
use crate::token::TokenValue;
use crate::token_set::TokenSet;
use crate::{
InterpolatedStringErrorType, Mode, ParseErrorType, UnsupportedSyntaxError,

View File

@@ -1,7 +1,8 @@
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{self as ast, CmpOp, Expr, ExprContext, Number};
use ruff_text_size::{Ranged, TextRange};
use crate::{TokenKind, error::RelaxedDecoratorError};
use crate::error::RelaxedDecoratorError;
/// Set the `ctx` for `Expr::Id`, `Expr::Attribute`, `Expr::Subscript`, `Expr::Starred`,
/// `Expr::Tuple` and `Expr::List`. If `expr` is either `Expr::Tuple` or `Expr::List`,

View File

@@ -2,6 +2,7 @@ use std::cmp::Ordering;
use bitflags::bitflags;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{AtomicNodeIndex, Mod, ModExpression, ModModule};
use ruff_text_size::{Ranged, TextRange, TextSize};
@@ -12,7 +13,7 @@ use crate::string::InterpolatedStringKind;
use crate::token::TokenValue;
use crate::token_set::TokenSet;
use crate::token_source::{TokenSource, TokenSourceCheckpoint};
use crate::{Mode, ParseError, ParseErrorType, TokenKind, UnsupportedSyntaxErrorKind};
use crate::{Mode, ParseError, ParseErrorType, UnsupportedSyntaxErrorKind};
use crate::{Parsed, Tokens};
pub use crate::parser::options::ParseOptions;

View File

@@ -1,4 +1,5 @@
use ruff_python_ast::name::Name;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{
self as ast, AtomicNodeIndex, Expr, ExprContext, Number, Operator, Pattern, Singleton,
};
@@ -7,7 +8,7 @@ use ruff_text_size::{Ranged, TextSize};
use crate::ParseErrorType;
use crate::parser::progress::ParserProgress;
use crate::parser::{Parser, RecoveryContextKind, SequenceMatchPatternParentheses, recovery};
use crate::token::{TokenKind, TokenValue};
use crate::token::TokenValue;
use crate::token_set::TokenSet;
use super::expression::ExpressionContext;

View File

@@ -2,6 +2,7 @@ use compact_str::CompactString;
use std::fmt::{Display, Write};
use ruff_python_ast::name::Name;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{
self as ast, AtomicNodeIndex, ExceptHandler, Expr, ExprContext, IpyEscapeKind, Operator,
PythonVersion, Stmt, WithItem,
@@ -14,7 +15,7 @@ use crate::parser::progress::ParserProgress;
use crate::parser::{
FunctionKind, Parser, RecoveryContext, RecoveryContextKind, WithItemKind, helpers,
};
use crate::token::{TokenKind, TokenValue};
use crate::token::TokenValue;
use crate::token_set::TokenSet;
use crate::{Mode, ParseErrorType, UnsupportedSyntaxErrorKind};

View File

@@ -144,11 +144,16 @@ impl SemanticSyntaxChecker {
}
}
}
Stmt::ClassDef(ast::StmtClassDef { type_params, .. })
| Stmt::TypeAlias(ast::StmtTypeAlias { type_params, .. }) => {
if let Some(type_params) = type_params {
Self::duplicate_type_parameter_name(type_params, ctx);
}
Stmt::ClassDef(ast::StmtClassDef {
type_params: Some(type_params),
..
})
| Stmt::TypeAlias(ast::StmtTypeAlias {
type_params: Some(type_params),
..
}) => {
Self::duplicate_type_parameter_name(type_params, ctx);
Self::type_parameter_default_order(type_params, ctx);
}
Stmt::Assign(ast::StmtAssign { targets, value, .. }) => {
if let [Expr::Starred(ast::ExprStarred { range, .. })] = targets.as_slice() {
@@ -611,6 +616,39 @@ impl SemanticSyntaxChecker {
}
}
fn type_parameter_default_order<Ctx: SemanticSyntaxContext>(
type_params: &ast::TypeParams,
ctx: &Ctx,
) {
let mut seen_default = false;
for type_param in type_params.iter() {
let has_default = match type_param {
ast::TypeParam::TypeVar(ast::TypeParamTypeVar { default, .. })
| ast::TypeParam::TypeVarTuple(ast::TypeParamTypeVarTuple { default, .. })
| ast::TypeParam::ParamSpec(ast::TypeParamParamSpec { default, .. }) => {
default.is_some()
}
};
if seen_default && !has_default {
// test_err type_parameter_default_order
// class C[T = int, U]: ...
// class C[T1, T2 = int, T3, T4]: ...
// type Alias[T = int, U] = ...
Self::add_error(
ctx,
SemanticSyntaxErrorKind::TypeParameterDefaultOrder(
type_param.name().id.to_string(),
),
type_param.range(),
);
}
if has_default {
seen_default = true;
}
}
}
fn duplicate_parameter_name<Ctx: SemanticSyntaxContext>(
parameters: &ast::Parameters,
ctx: &Ctx,
@@ -896,7 +934,7 @@ impl SemanticSyntaxChecker {
// This check is required in addition to avoiding calling this function in `visit_expr`
// because the generator scope applies to nested parts of the `Expr::Generator` that are
// visited separately.
if ctx.in_generator_scope() {
if ctx.in_generator_context() {
return;
}
Self::add_error(
@@ -1066,6 +1104,12 @@ impl Display for SemanticSyntaxError {
SemanticSyntaxErrorKind::DuplicateTypeParameter => {
f.write_str("duplicate type parameter")
}
SemanticSyntaxErrorKind::TypeParameterDefaultOrder(name) => {
write!(
f,
"non default type parameter `{name}` follows default type parameter"
)
}
SemanticSyntaxErrorKind::MultipleCaseAssignment(name) => {
write!(f, "multiple assignments to name `{name}` in pattern")
}
@@ -1572,6 +1616,9 @@ pub enum SemanticSyntaxErrorKind {
/// Represents a nonlocal statement for a name that has no binding in an enclosing scope.
NonlocalWithoutBinding(String),
/// Represents a default type parameter followed by a non-default type parameter.
TypeParameterDefaultOrder(String),
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, get_size2::GetSize)]
@@ -2096,11 +2143,11 @@ pub trait SemanticSyntaxContext {
/// Returns `true` if the visitor is in a function scope.
fn in_function_scope(&self) -> bool;
/// Returns `true` if the visitor is in a generator scope.
/// Returns `true` if the visitor is within a generator scope.
///
/// Note that this refers to an `Expr::Generator` precisely, not to comprehensions more
/// generally.
fn in_generator_scope(&self) -> bool;
fn in_generator_context(&self) -> bool;
/// Returns `true` if the source file is a Jupyter notebook.
fn in_notebook(&self) -> bool;
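
The ordering check added in `type_parameter_default_order` is a single pass with one flag. As a rough, self-contained illustration of that logic, the sketch below assumes a simplified `TypeParam` struct (hypothetical, reduced to a name and a `has_default` flag) and collects the offending names into a `Vec` instead of reporting through a `SemanticSyntaxContext`.

```rust
/// Hypothetical, simplified type parameter: a name plus whether it has a default.
struct TypeParam<'a> {
    name: &'a str,
    has_default: bool,
}

/// Returns the names of all parameters without a default that appear after
/// a parameter with a default, in source order.
fn default_order_violations<'a>(params: &[TypeParam<'a>]) -> Vec<&'a str> {
    let mut seen_default = false;
    let mut violations = Vec::new();
    for param in params {
        // Every non-default parameter after the first default is reported,
        // not only the first one.
        if seen_default && !param.has_default {
            violations.push(param.name);
        }
        seen_default |= param.has_default;
    }
    violations
}
```

For the test case `class C[T1, T2 = int, T3, T4]: ...`, this sketch reports `T3` and `T4`, which corresponds to one error per offending parameter in the checker.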

View File

@@ -3,13 +3,11 @@
use bstr::ByteSlice;
use std::fmt;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{self as ast, AnyStringFlags, AtomicNodeIndex, Expr, StringFlags};
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::{
TokenKind,
error::{LexicalError, LexicalErrorType},
};
use crate::error::{LexicalError, LexicalErrorType};
#[derive(Debug)]
pub(crate) enum StringType {

View File

@@ -1,848 +1,4 @@
//! Token kinds for Python source code created by the lexer and consumed by the `ruff_python_parser`.
//!
//! This module defines the tokens that the lexer recognizes. The tokens are
//! loosely based on the token definitions found in the [CPython source].
//!
//! [CPython source]: https://github.com/python/cpython/blob/dfc2e065a2e71011017077e549cd2f9bf4944c54/Grammar/Tokens
use std::fmt;
use bitflags::bitflags;
use ruff_python_ast::name::Name;
use ruff_python_ast::str::{Quote, TripleQuotes};
use ruff_python_ast::str_prefix::{
AnyStringPrefix, ByteStringPrefix, FStringPrefix, StringLiteralPrefix, TStringPrefix,
};
use ruff_python_ast::{AnyStringFlags, BoolOp, Int, IpyEscapeKind, Operator, StringFlags, UnaryOp};
use ruff_text_size::{Ranged, TextRange};
#[derive(Clone, Copy, PartialEq, Eq, get_size2::GetSize)]
pub struct Token {
/// The kind of the token.
kind: TokenKind,
/// The range of the token.
range: TextRange,
/// The set of flags describing this token.
flags: TokenFlags,
}
impl Token {
pub(crate) fn new(kind: TokenKind, range: TextRange, flags: TokenFlags) -> Token {
Self { kind, range, flags }
}
/// Returns the token kind.
#[inline]
pub const fn kind(&self) -> TokenKind {
self.kind
}
/// Returns the token as a tuple of (kind, range).
#[inline]
pub const fn as_tuple(&self) -> (TokenKind, TextRange) {
(self.kind, self.range)
}
/// Returns `true` if the current token is a triple-quoted string of any kind.
///
/// # Panics
///
/// If it isn't a string or any f/t-string tokens.
pub fn is_triple_quoted_string(self) -> bool {
self.unwrap_string_flags().is_triple_quoted()
}
/// Returns the [`Quote`] style for the current string token of any kind.
///
/// # Panics
///
/// If it isn't a string or any f/t-string tokens.
pub fn string_quote_style(self) -> Quote {
self.unwrap_string_flags().quote_style()
}
/// Returns the [`AnyStringFlags`] style for the current string token of any kind.
///
/// # Panics
///
/// If it isn't a string or any f/t-string tokens.
pub fn unwrap_string_flags(self) -> AnyStringFlags {
self.string_flags()
.unwrap_or_else(|| panic!("token to be a string"))
}
/// Returns the [`AnyStringFlags`] for the current token if it is a string token of any kind,
/// otherwise returns `None`.
pub fn string_flags(self) -> Option<AnyStringFlags> {
if self.is_any_string() {
Some(self.flags.as_any_string_flags())
} else {
None
}
}
/// Returns `true` if this is any kind of string token - including
/// tokens in t-strings (which do not have type `str`).
const fn is_any_string(self) -> bool {
matches!(
self.kind,
TokenKind::String
| TokenKind::FStringStart
| TokenKind::FStringMiddle
| TokenKind::FStringEnd
| TokenKind::TStringStart
| TokenKind::TStringMiddle
| TokenKind::TStringEnd
)
}
}
impl Ranged for Token {
fn range(&self) -> TextRange {
self.range
}
}
impl fmt::Debug for Token {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{:?} {:?}", self.kind, self.range)?;
if !self.flags.is_empty() {
f.write_str(" (flags = ")?;
let mut first = true;
for (name, _) in self.flags.iter_names() {
if first {
first = false;
} else {
f.write_str(" | ")?;
}
f.write_str(name)?;
}
f.write_str(")")?;
}
Ok(())
}
}
/// A kind of a token.
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug, PartialOrd, Ord, get_size2::GetSize)]
pub enum TokenKind {
/// Token kind for a name, commonly known as an identifier.
Name,
/// Token kind for an integer.
Int,
/// Token kind for a floating point number.
Float,
/// Token kind for a complex number.
Complex,
/// Token kind for a string.
String,
/// Token kind for the start of an f-string. This includes the `f`/`F`/`fr` prefix
/// and the opening quote(s).
FStringStart,
/// Token kind that includes the portion of text inside the f-string that's not
/// part of the expression part and isn't an opening or closing brace.
FStringMiddle,
/// Token kind for the end of an f-string. This includes the closing quote.
FStringEnd,
/// Token kind for the start of a t-string. This includes the `t`/`T`/`tr` prefix
/// and the opening quote(s).
TStringStart,
/// Token kind that includes the portion of text inside the t-string that's not
/// part of the interpolation part and isn't an opening or closing brace.
TStringMiddle,
/// Token kind for the end of a t-string. This includes the closing quote.
TStringEnd,
/// Token kind for an IPython escape command.
IpyEscapeCommand,
/// Token kind for a comment. These are filtered out of the token stream prior to parsing.
Comment,
/// Token kind for a newline.
Newline,
/// Token kind for a newline that is not a logical line break. These are filtered out of
/// the token stream prior to parsing.
NonLogicalNewline,
/// Token kind for an indent.
Indent,
/// Token kind for a dedent.
Dedent,
EndOfFile,
/// Token kind for a question mark `?`.
Question,
/// Token kind for an exclamation mark `!`.
Exclamation,
/// Token kind for a left parenthesis `(`.
Lpar,
/// Token kind for a right parenthesis `)`.
Rpar,
/// Token kind for a left square bracket `[`.
Lsqb,
/// Token kind for a right square bracket `]`.
Rsqb,
/// Token kind for a colon `:`.
Colon,
/// Token kind for a comma `,`.
Comma,
/// Token kind for a semicolon `;`.
Semi,
/// Token kind for plus `+`.
Plus,
/// Token kind for minus `-`.
Minus,
/// Token kind for star `*`.
Star,
/// Token kind for slash `/`.
Slash,
/// Token kind for vertical bar `|`.
Vbar,
/// Token kind for ampersand `&`.
Amper,
/// Token kind for less than `<`.
Less,
/// Token kind for greater than `>`.
Greater,
/// Token kind for equal `=`.
Equal,
/// Token kind for dot `.`.
Dot,
/// Token kind for percent `%`.
Percent,
/// Token kind for left brace `{`.
Lbrace,
/// Token kind for right brace `}`.
Rbrace,
/// Token kind for double equal `==`.
EqEqual,
/// Token kind for not equal `!=`.
NotEqual,
/// Token kind for less than or equal `<=`.
LessEqual,
/// Token kind for greater than or equal `>=`.
GreaterEqual,
/// Token kind for tilde `~`.
Tilde,
/// Token kind for caret `^`.
CircumFlex,
/// Token kind for left shift `<<`.
LeftShift,
/// Token kind for right shift `>>`.
RightShift,
/// Token kind for double star `**`.
DoubleStar,
/// Token kind for double star equal `**=`.
DoubleStarEqual,
/// Token kind for plus equal `+=`.
PlusEqual,
/// Token kind for minus equal `-=`.
MinusEqual,
/// Token kind for star equal `*=`.
StarEqual,
/// Token kind for slash equal `/=`.
SlashEqual,
/// Token kind for percent equal `%=`.
PercentEqual,
/// Token kind for ampersand equal `&=`.
AmperEqual,
/// Token kind for vertical bar equal `|=`.
VbarEqual,
/// Token kind for caret equal `^=`.
CircumflexEqual,
/// Token kind for left shift equal `<<=`.
LeftShiftEqual,
/// Token kind for right shift equal `>>=`.
RightShiftEqual,
/// Token kind for double slash `//`.
DoubleSlash,
/// Token kind for double slash equal `//=`.
DoubleSlashEqual,
/// Token kind for colon equal `:=`.
ColonEqual,
/// Token kind for at `@`.
At,
/// Token kind for at equal `@=`.
AtEqual,
/// Token kind for arrow `->`.
Rarrow,
/// Token kind for ellipsis `...`.
Ellipsis,
// The keywords should be sorted in alphabetical order. If the boundary tokens for the
// "Keywords" and "Soft keywords" group change, update the related methods on `TokenKind`.
// Keywords
And,
As,
Assert,
Async,
Await,
Break,
Class,
Continue,
Def,
Del,
Elif,
Else,
Except,
False,
Finally,
For,
From,
Global,
If,
Import,
In,
Is,
Lambda,
None,
Nonlocal,
Not,
Or,
Pass,
Raise,
Return,
True,
Try,
While,
With,
Yield,
// Soft keywords
Case,
Match,
Type,
Unknown,
}
impl TokenKind {
/// Returns `true` if this is an end of file token.
#[inline]
pub const fn is_eof(self) -> bool {
matches!(self, TokenKind::EndOfFile)
}
/// Returns `true` if this is either a newline or non-logical newline token.
#[inline]
pub const fn is_any_newline(self) -> bool {
matches!(self, TokenKind::Newline | TokenKind::NonLogicalNewline)
}
/// Returns `true` if the token is a keyword (including soft keywords).
///
/// See also [`is_soft_keyword`], [`is_non_soft_keyword`].
///
/// [`is_soft_keyword`]: TokenKind::is_soft_keyword
/// [`is_non_soft_keyword`]: TokenKind::is_non_soft_keyword
#[inline]
pub fn is_keyword(self) -> bool {
TokenKind::And <= self && self <= TokenKind::Type
}
/// Returns `true` if the token is strictly a soft keyword.
///
/// See also [`is_keyword`], [`is_non_soft_keyword`].
///
/// [`is_keyword`]: TokenKind::is_keyword
/// [`is_non_soft_keyword`]: TokenKind::is_non_soft_keyword
#[inline]
pub fn is_soft_keyword(self) -> bool {
TokenKind::Case <= self && self <= TokenKind::Type
}
/// Returns `true` if the token is strictly a non-soft keyword.
///
/// See also [`is_keyword`], [`is_soft_keyword`].
///
/// [`is_keyword`]: TokenKind::is_keyword
/// [`is_soft_keyword`]: TokenKind::is_soft_keyword
#[inline]
pub fn is_non_soft_keyword(self) -> bool {
TokenKind::And <= self && self <= TokenKind::Yield
}
#[inline]
pub const fn is_operator(self) -> bool {
matches!(
self,
TokenKind::Lpar
| TokenKind::Rpar
| TokenKind::Lsqb
| TokenKind::Rsqb
| TokenKind::Comma
| TokenKind::Semi
| TokenKind::Plus
| TokenKind::Minus
| TokenKind::Star
| TokenKind::Slash
| TokenKind::Vbar
| TokenKind::Amper
| TokenKind::Less
| TokenKind::Greater
| TokenKind::Equal
| TokenKind::Dot
| TokenKind::Percent
| TokenKind::Lbrace
| TokenKind::Rbrace
| TokenKind::EqEqual
| TokenKind::NotEqual
| TokenKind::LessEqual
| TokenKind::GreaterEqual
| TokenKind::Tilde
| TokenKind::CircumFlex
| TokenKind::LeftShift
| TokenKind::RightShift
| TokenKind::DoubleStar
| TokenKind::PlusEqual
| TokenKind::MinusEqual
| TokenKind::StarEqual
| TokenKind::SlashEqual
| TokenKind::PercentEqual
| TokenKind::AmperEqual
| TokenKind::VbarEqual
| TokenKind::CircumflexEqual
| TokenKind::LeftShiftEqual
| TokenKind::RightShiftEqual
| TokenKind::DoubleStarEqual
| TokenKind::DoubleSlash
| TokenKind::DoubleSlashEqual
| TokenKind::At
| TokenKind::AtEqual
| TokenKind::Rarrow
| TokenKind::Ellipsis
| TokenKind::ColonEqual
| TokenKind::Colon
| TokenKind::And
| TokenKind::Or
| TokenKind::Not
| TokenKind::In
| TokenKind::Is
)
}
/// Returns `true` if this is a singleton token i.e., `True`, `False`, or `None`.
#[inline]
pub const fn is_singleton(self) -> bool {
matches!(self, TokenKind::False | TokenKind::True | TokenKind::None)
}
/// Returns `true` if this is a trivia token i.e., a comment or a non-logical newline.
#[inline]
pub const fn is_trivia(&self) -> bool {
matches!(self, TokenKind::Comment | TokenKind::NonLogicalNewline)
}
/// Returns `true` if this is a comment token.
#[inline]
pub const fn is_comment(&self) -> bool {
matches!(self, TokenKind::Comment)
}
#[inline]
pub const fn is_arithmetic(self) -> bool {
matches!(
self,
TokenKind::DoubleStar
| TokenKind::Star
| TokenKind::Plus
| TokenKind::Minus
| TokenKind::Slash
| TokenKind::DoubleSlash
| TokenKind::At
)
}
#[inline]
pub const fn is_bitwise_or_shift(self) -> bool {
matches!(
self,
TokenKind::LeftShift
| TokenKind::LeftShiftEqual
| TokenKind::RightShift
| TokenKind::RightShiftEqual
| TokenKind::Amper
| TokenKind::AmperEqual
| TokenKind::Vbar
| TokenKind::VbarEqual
| TokenKind::CircumFlex
| TokenKind::CircumflexEqual
| TokenKind::Tilde
)
}
/// Returns `true` if the current token is a unary arithmetic operator.
#[inline]
pub const fn is_unary_arithmetic_operator(self) -> bool {
matches!(self, TokenKind::Plus | TokenKind::Minus)
}
#[inline]
pub const fn is_interpolated_string_end(self) -> bool {
matches!(self, TokenKind::FStringEnd | TokenKind::TStringEnd)
}
/// Returns the [`UnaryOp`] that corresponds to this token kind, if it is a unary arithmetic
/// operator, otherwise return [None].
///
/// Use [`as_unary_operator`] to match against any unary operator.
///
/// [`as_unary_operator`]: TokenKind::as_unary_operator
#[inline]
pub const fn as_unary_arithmetic_operator(self) -> Option<UnaryOp> {
Some(match self {
TokenKind::Plus => UnaryOp::UAdd,
TokenKind::Minus => UnaryOp::USub,
_ => return None,
})
}
/// Returns the [`UnaryOp`] that corresponds to this token kind, if it is a unary operator,
/// otherwise returns [`None`].
///
/// Use [`as_unary_arithmetic_operator`] to match against only an arithmetic unary operator.
///
/// [`as_unary_arithmetic_operator`]: TokenKind::as_unary_arithmetic_operator
#[inline]
pub const fn as_unary_operator(self) -> Option<UnaryOp> {
Some(match self {
TokenKind::Plus => UnaryOp::UAdd,
TokenKind::Minus => UnaryOp::USub,
TokenKind::Tilde => UnaryOp::Invert,
TokenKind::Not => UnaryOp::Not,
_ => return None,
})
}
/// Returns the [`BoolOp`] that corresponds to this token kind, if it is a boolean operator,
/// otherwise returns [`None`].
#[inline]
pub const fn as_bool_operator(self) -> Option<BoolOp> {
Some(match self {
TokenKind::And => BoolOp::And,
TokenKind::Or => BoolOp::Or,
_ => return None,
})
}
/// Returns the binary [`Operator`] that corresponds to the current token, if it's a binary
/// operator, otherwise returns [`None`].
///
/// Use [`as_augmented_assign_operator`] to match against an augmented assignment token.
///
/// [`as_augmented_assign_operator`]: TokenKind::as_augmented_assign_operator
pub const fn as_binary_operator(self) -> Option<Operator> {
Some(match self {
TokenKind::Plus => Operator::Add,
TokenKind::Minus => Operator::Sub,
TokenKind::Star => Operator::Mult,
TokenKind::At => Operator::MatMult,
TokenKind::DoubleStar => Operator::Pow,
TokenKind::Slash => Operator::Div,
TokenKind::DoubleSlash => Operator::FloorDiv,
TokenKind::Percent => Operator::Mod,
TokenKind::Amper => Operator::BitAnd,
TokenKind::Vbar => Operator::BitOr,
TokenKind::CircumFlex => Operator::BitXor,
TokenKind::LeftShift => Operator::LShift,
TokenKind::RightShift => Operator::RShift,
_ => return None,
})
}
/// Returns the [`Operator`] that corresponds to this token kind, if it is
/// an augmented assignment operator, or [`None`] otherwise.
#[inline]
pub const fn as_augmented_assign_operator(self) -> Option<Operator> {
Some(match self {
TokenKind::PlusEqual => Operator::Add,
TokenKind::MinusEqual => Operator::Sub,
TokenKind::StarEqual => Operator::Mult,
TokenKind::AtEqual => Operator::MatMult,
TokenKind::DoubleStarEqual => Operator::Pow,
TokenKind::SlashEqual => Operator::Div,
TokenKind::DoubleSlashEqual => Operator::FloorDiv,
TokenKind::PercentEqual => Operator::Mod,
TokenKind::AmperEqual => Operator::BitAnd,
TokenKind::VbarEqual => Operator::BitOr,
TokenKind::CircumflexEqual => Operator::BitXor,
TokenKind::LeftShiftEqual => Operator::LShift,
TokenKind::RightShiftEqual => Operator::RShift,
_ => return None,
})
}
}
impl From<BoolOp> for TokenKind {
#[inline]
fn from(op: BoolOp) -> Self {
match op {
BoolOp::And => TokenKind::And,
BoolOp::Or => TokenKind::Or,
}
}
}
impl From<UnaryOp> for TokenKind {
#[inline]
fn from(op: UnaryOp) -> Self {
match op {
UnaryOp::Invert => TokenKind::Tilde,
UnaryOp::Not => TokenKind::Not,
UnaryOp::UAdd => TokenKind::Plus,
UnaryOp::USub => TokenKind::Minus,
}
}
}
impl From<Operator> for TokenKind {
#[inline]
fn from(op: Operator) -> Self {
match op {
Operator::Add => TokenKind::Plus,
Operator::Sub => TokenKind::Minus,
Operator::Mult => TokenKind::Star,
Operator::MatMult => TokenKind::At,
Operator::Div => TokenKind::Slash,
Operator::Mod => TokenKind::Percent,
Operator::Pow => TokenKind::DoubleStar,
Operator::LShift => TokenKind::LeftShift,
Operator::RShift => TokenKind::RightShift,
Operator::BitOr => TokenKind::Vbar,
Operator::BitXor => TokenKind::CircumFlex,
Operator::BitAnd => TokenKind::Amper,
Operator::FloorDiv => TokenKind::DoubleSlash,
}
}
}
impl fmt::Display for TokenKind {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let value = match self {
TokenKind::Unknown => "Unknown",
TokenKind::Newline => "newline",
TokenKind::NonLogicalNewline => "NonLogicalNewline",
TokenKind::Indent => "indent",
TokenKind::Dedent => "dedent",
TokenKind::EndOfFile => "end of file",
TokenKind::Name => "name",
TokenKind::Int => "int",
TokenKind::Float => "float",
TokenKind::Complex => "complex",
TokenKind::String => "string",
TokenKind::FStringStart => "FStringStart",
TokenKind::FStringMiddle => "FStringMiddle",
TokenKind::FStringEnd => "FStringEnd",
TokenKind::TStringStart => "TStringStart",
TokenKind::TStringMiddle => "TStringMiddle",
TokenKind::TStringEnd => "TStringEnd",
TokenKind::IpyEscapeCommand => "IPython escape command",
TokenKind::Comment => "comment",
TokenKind::Question => "`?`",
TokenKind::Exclamation => "`!`",
TokenKind::Lpar => "`(`",
TokenKind::Rpar => "`)`",
TokenKind::Lsqb => "`[`",
TokenKind::Rsqb => "`]`",
TokenKind::Lbrace => "`{`",
TokenKind::Rbrace => "`}`",
TokenKind::Equal => "`=`",
TokenKind::ColonEqual => "`:=`",
TokenKind::Dot => "`.`",
TokenKind::Colon => "`:`",
TokenKind::Semi => "`;`",
TokenKind::Comma => "`,`",
TokenKind::Rarrow => "`->`",
TokenKind::Plus => "`+`",
TokenKind::Minus => "`-`",
TokenKind::Star => "`*`",
TokenKind::DoubleStar => "`**`",
TokenKind::Slash => "`/`",
TokenKind::DoubleSlash => "`//`",
TokenKind::Percent => "`%`",
TokenKind::Vbar => "`|`",
TokenKind::Amper => "`&`",
TokenKind::CircumFlex => "`^`",
TokenKind::LeftShift => "`<<`",
TokenKind::RightShift => "`>>`",
TokenKind::Tilde => "`~`",
TokenKind::At => "`@`",
TokenKind::Less => "`<`",
TokenKind::Greater => "`>`",
TokenKind::EqEqual => "`==`",
TokenKind::NotEqual => "`!=`",
TokenKind::LessEqual => "`<=`",
TokenKind::GreaterEqual => "`>=`",
TokenKind::PlusEqual => "`+=`",
TokenKind::MinusEqual => "`-=`",
TokenKind::StarEqual => "`*=`",
TokenKind::DoubleStarEqual => "`**=`",
TokenKind::SlashEqual => "`/=`",
TokenKind::DoubleSlashEqual => "`//=`",
TokenKind::PercentEqual => "`%=`",
TokenKind::VbarEqual => "`|=`",
TokenKind::AmperEqual => "`&=`",
TokenKind::CircumflexEqual => "`^=`",
TokenKind::LeftShiftEqual => "`<<=`",
TokenKind::RightShiftEqual => "`>>=`",
TokenKind::AtEqual => "`@=`",
TokenKind::Ellipsis => "`...`",
TokenKind::False => "`False`",
TokenKind::None => "`None`",
TokenKind::True => "`True`",
TokenKind::And => "`and`",
TokenKind::As => "`as`",
TokenKind::Assert => "`assert`",
TokenKind::Async => "`async`",
TokenKind::Await => "`await`",
TokenKind::Break => "`break`",
TokenKind::Class => "`class`",
TokenKind::Continue => "`continue`",
TokenKind::Def => "`def`",
TokenKind::Del => "`del`",
TokenKind::Elif => "`elif`",
TokenKind::Else => "`else`",
TokenKind::Except => "`except`",
TokenKind::Finally => "`finally`",
TokenKind::For => "`for`",
TokenKind::From => "`from`",
TokenKind::Global => "`global`",
TokenKind::If => "`if`",
TokenKind::Import => "`import`",
TokenKind::In => "`in`",
TokenKind::Is => "`is`",
TokenKind::Lambda => "`lambda`",
TokenKind::Nonlocal => "`nonlocal`",
TokenKind::Not => "`not`",
TokenKind::Or => "`or`",
TokenKind::Pass => "`pass`",
TokenKind::Raise => "`raise`",
TokenKind::Return => "`return`",
TokenKind::Try => "`try`",
TokenKind::While => "`while`",
TokenKind::Match => "`match`",
TokenKind::Type => "`type`",
TokenKind::Case => "`case`",
TokenKind::With => "`with`",
TokenKind::Yield => "`yield`",
};
f.write_str(value)
}
}
bitflags! {
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub(crate) struct TokenFlags: u16 {
/// The token is a string with double quotes (`"`).
const DOUBLE_QUOTES = 1 << 0;
/// The token is a triple-quoted string i.e., it starts and ends with three consecutive
/// quote characters (`"""` or `'''`).
const TRIPLE_QUOTED_STRING = 1 << 1;
/// The token is a unicode string i.e., prefixed with `u` or `U`
const UNICODE_STRING = 1 << 2;
/// The token is a byte string i.e., prefixed with `b` or `B`
const BYTE_STRING = 1 << 3;
/// The token is an f-string i.e., prefixed with `f` or `F`
const F_STRING = 1 << 4;
/// The token is a t-string i.e., prefixed with `t` or `T`
const T_STRING = 1 << 5;
/// The token is a raw string and the prefix character is in lowercase.
const RAW_STRING_LOWERCASE = 1 << 6;
/// The token is a raw string and the prefix character is in uppercase.
const RAW_STRING_UPPERCASE = 1 << 7;
/// String without matching closing quote(s)
const UNCLOSED_STRING = 1 << 8;
/// The token is a raw string i.e., prefixed with `r` or `R`
const RAW_STRING = Self::RAW_STRING_LOWERCASE.bits() | Self::RAW_STRING_UPPERCASE.bits();
}
}
impl get_size2::GetSize for TokenFlags {}
impl StringFlags for TokenFlags {
fn quote_style(self) -> Quote {
if self.intersects(TokenFlags::DOUBLE_QUOTES) {
Quote::Double
} else {
Quote::Single
}
}
fn triple_quotes(self) -> TripleQuotes {
if self.intersects(TokenFlags::TRIPLE_QUOTED_STRING) {
TripleQuotes::Yes
} else {
TripleQuotes::No
}
}
fn prefix(self) -> AnyStringPrefix {
if self.intersects(TokenFlags::F_STRING) {
if self.intersects(TokenFlags::RAW_STRING_LOWERCASE) {
AnyStringPrefix::Format(FStringPrefix::Raw { uppercase_r: false })
} else if self.intersects(TokenFlags::RAW_STRING_UPPERCASE) {
AnyStringPrefix::Format(FStringPrefix::Raw { uppercase_r: true })
} else {
AnyStringPrefix::Format(FStringPrefix::Regular)
}
} else if self.intersects(TokenFlags::T_STRING) {
if self.intersects(TokenFlags::RAW_STRING_LOWERCASE) {
AnyStringPrefix::Template(TStringPrefix::Raw { uppercase_r: false })
} else if self.intersects(TokenFlags::RAW_STRING_UPPERCASE) {
AnyStringPrefix::Template(TStringPrefix::Raw { uppercase_r: true })
} else {
AnyStringPrefix::Template(TStringPrefix::Regular)
}
} else if self.intersects(TokenFlags::BYTE_STRING) {
if self.intersects(TokenFlags::RAW_STRING_LOWERCASE) {
AnyStringPrefix::Bytes(ByteStringPrefix::Raw { uppercase_r: false })
} else if self.intersects(TokenFlags::RAW_STRING_UPPERCASE) {
AnyStringPrefix::Bytes(ByteStringPrefix::Raw { uppercase_r: true })
} else {
AnyStringPrefix::Bytes(ByteStringPrefix::Regular)
}
} else if self.intersects(TokenFlags::RAW_STRING_LOWERCASE) {
AnyStringPrefix::Regular(StringLiteralPrefix::Raw { uppercase: false })
} else if self.intersects(TokenFlags::RAW_STRING_UPPERCASE) {
AnyStringPrefix::Regular(StringLiteralPrefix::Raw { uppercase: true })
} else if self.intersects(TokenFlags::UNICODE_STRING) {
AnyStringPrefix::Regular(StringLiteralPrefix::Unicode)
} else {
AnyStringPrefix::Regular(StringLiteralPrefix::Empty)
}
}
fn is_unclosed(self) -> bool {
self.intersects(TokenFlags::UNCLOSED_STRING)
}
}
impl TokenFlags {
/// Returns `true` if the token is an f-string.
pub(crate) const fn is_f_string(self) -> bool {
self.intersects(TokenFlags::F_STRING)
}
/// Returns `true` if the token is a t-string.
pub(crate) const fn is_t_string(self) -> bool {
self.intersects(TokenFlags::T_STRING)
}
/// Returns `true` if the token is an interpolated string i.e., an f-string or a t-string.
pub(crate) const fn is_interpolated_string(self) -> bool {
self.intersects(TokenFlags::T_STRING.union(TokenFlags::F_STRING))
}
/// Returns `true` if the token is a triple-quoted interpolated string (f-string or t-string).
pub(crate) fn is_triple_quoted_interpolated_string(self) -> bool {
self.intersects(TokenFlags::TRIPLE_QUOTED_STRING) && self.is_interpolated_string()
}
/// Returns `true` if the token is a raw string.
pub(crate) const fn is_raw_string(self) -> bool {
self.intersects(TokenFlags::RAW_STRING)
}
}
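
The `TokenFlags` code above encodes two ideas: `RAW_STRING` is the union of the lowercase and uppercase raw bits, and `prefix` checks the f-string, t-string, and byte-string bits in that order before falling back to a regular string. A minimal sketch of that precedence follows, using plain `u16` constants instead of the `bitflags!` type. The `Prefix` enum and the free functions `intersects` and `prefix` are hypothetical simplifications for illustration, not the real `AnyStringPrefix` API, and the sketch ignores the `u` prefix and the case of the `r` prefix.

```rust
// Hypothetical stand-ins for a few `TokenFlags` bits; the bit positions mirror the diff.
const BYTE_STRING: u16 = 1 << 3;
const F_STRING: u16 = 1 << 4;
const T_STRING: u16 = 1 << 5;
const RAW_STRING_LOWERCASE: u16 = 1 << 6;
const RAW_STRING_UPPERCASE: u16 = 1 << 7;
// A raw string uses either case of the `r` prefix, so the mask is the union of both bits.
const RAW_STRING: u16 = RAW_STRING_LOWERCASE | RAW_STRING_UPPERCASE;

/// Returns `true` if `flags` shares at least one bit with `mask`.
const fn intersects(flags: u16, mask: u16) -> bool {
    (flags & mask) != 0
}

/// Simplified prefix description (hypothetical; the real code returns `AnyStringPrefix`).
#[derive(Debug, PartialEq, Eq)]
enum Prefix {
    Format { raw: bool },
    Template { raw: bool },
    Bytes { raw: bool },
    Regular { raw: bool },
}

/// Picks the string kind in the same order as the diff: f-string, t-string, bytes, regular.
fn prefix(flags: u16) -> Prefix {
    let raw = intersects(flags, RAW_STRING);
    if intersects(flags, F_STRING) {
        Prefix::Format { raw }
    } else if intersects(flags, T_STRING) {
        Prefix::Template { raw }
    } else if intersects(flags, BYTE_STRING) {
        Prefix::Bytes { raw }
    } else {
        Prefix::Regular { raw }
    }
}
```

Because the raw bits are tested independently of the string kind, a flag value such as `F_STRING | RAW_STRING_UPPERCASE` maps to a raw format string in this sketch.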
use ruff_python_ast::{Int, IpyEscapeKind, name::Name};
#[derive(Clone, Debug, Default)]
pub(crate) enum TokenValue {


@@ -1,4 +1,4 @@
use crate::TokenKind;
use ruff_python_ast::token::TokenKind;
/// A bit-set of `TokenKind`s
#[derive(Clone, Copy)]
@@ -42,7 +42,7 @@ impl<const N: usize> From<[TokenKind; N]> for TokenSet {
#[test]
fn token_set_works_for_tokens() {
use crate::TokenKind::*;
use ruff_python_ast::token::TokenKind::*;
let mut ts = TokenSet::new([EndOfFile, Name]);
assert!(ts.contains(EndOfFile));
assert!(ts.contains(Name));
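
The hunk above only shows that `TokenSet` is "a bit-set of `TokenKind`s" built from an array and queried with `contains`. As a rough illustration of that technique, here is a self-contained sketch over a tiny hypothetical `Kind` enum. The backing integer width, the `KindSet` name, and the `union` helper are assumptions; the real `TokenSet` layout is not visible in this diff.

```rust
/// Hypothetical miniature of `TokenKind`; the real enum has many more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Kind {
    EndOfFile = 0,
    Name = 1,
    Plus = 2,
    Minus = 3,
}

/// A bit-set of `Kind`s: bit `n` is set when the kind with discriminant `n` is a member.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct KindSet(u64);

impl KindSet {
    /// Builds a set from an array of kinds.
    const fn new<const N: usize>(kinds: [Kind; N]) -> Self {
        let mut bits = 0u64;
        let mut i = 0;
        // `for` loops are not allowed in a `const fn`, so iterate with `while`.
        while i < N {
            bits |= 1u64 << (kinds[i] as u32);
            i += 1;
        }
        KindSet(bits)
    }

    /// Returns `true` if `kind` is a member of the set.
    const fn contains(self, kind: Kind) -> bool {
        (self.0 & (1u64 << (kind as u32))) != 0
    }

    /// Returns the set containing every member of `self` and `other`.
    const fn union(self, other: KindSet) -> KindSet {
        KindSet(self.0 | other.0)
    }
}
```

Membership tests and unions are single bitwise operations, which is why a parser can afford to consult such sets on every token.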


@@ -1,10 +1,11 @@
use ruff_python_ast::token::{Token, TokenFlags, TokenKind};
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::Mode;
use crate::error::LexicalError;
use crate::lexer::{Lexer, LexerCheckpoint};
use crate::string::InterpolatedStringKind;
use crate::token::{Token, TokenFlags, TokenKind, TokenValue};
use crate::token::TokenValue;
/// Token source for the parser that skips over any trivia tokens.
#[derive(Debug)]


@@ -5,13 +5,14 @@ use std::fs;
use std::path::Path;
use ruff_annotate_snippets::{Level, Renderer, Snippet};
use ruff_python_ast::token::Token;
use ruff_python_ast::visitor::Visitor;
use ruff_python_ast::visitor::source_order::{SourceOrderVisitor, TraversalSignal, walk_module};
use ruff_python_ast::{self as ast, AnyNodeRef, Mod, PythonVersion};
use ruff_python_parser::semantic_errors::{
SemanticSyntaxChecker, SemanticSyntaxContext, SemanticSyntaxError,
};
use ruff_python_parser::{Mode, ParseErrorType, ParseOptions, Token, parse_unchecked};
use ruff_python_parser::{Mode, ParseErrorType, ParseOptions, parse_unchecked};
use ruff_source_file::{LineIndex, OneIndexed, SourceCode};
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};
@@ -572,7 +573,7 @@ impl SemanticSyntaxContext for SemanticSyntaxCheckerVisitor<'_> {
true
}
fn in_generator_scope(&self) -> bool {
fn in_generator_context(&self) -> bool {
true
}


@@ -375,3 +375,12 @@ Module(
4 | type X[**P = x := int] = int
5 | type X[**P = *int] = int
|
|
2 | type X[**P = yield x] = int
3 | type X[**P = yield from x] = int
4 | type X[**P = x := int] = int
| ^^^ Syntax Error: non default type parameter `int` follows default type parameter
5 | type X[**P = *int] = int
|


@@ -459,3 +459,12 @@ Module(
5 | type X[T = x := int] = int
6 | type X[T: int = *int] = int
|
|
3 | type X[T = (yield x)] = int
4 | type X[T = yield from x] = int
5 | type X[T = x := int] = int
| ^^^ Syntax Error: non default type parameter `int` follows default type parameter
6 | type X[T: int = *int] = int
|


@@ -384,3 +384,11 @@ Module(
| ^^^^^^^^^^^^ Syntax Error: yield expression cannot be used within a TypeVarTuple default
5 | type X[*Ts = x := int] = int
|
|
3 | type X[*Ts = yield x] = int
4 | type X[*Ts = yield from x] = int
5 | type X[*Ts = x := int] = int
| ^^^ Syntax Error: non default type parameter `int` follows default type parameter
|


@@ -0,0 +1,277 @@
---
source: crates/ruff_python_parser/tests/fixtures.rs
input_file: crates/ruff_python_parser/resources/inline/err/type_parameter_default_order.py
---
## AST
```
Module(
ModModule {
node_index: NodeIndex(None),
range: 0..89,
body: [
ClassDef(
StmtClassDef {
node_index: NodeIndex(None),
range: 0..24,
decorator_list: [],
name: Identifier {
id: Name("C"),
range: 6..7,
node_index: NodeIndex(None),
},
type_params: Some(
TypeParams {
range: 7..19,
node_index: NodeIndex(None),
type_params: [
TypeVar(
TypeParamTypeVar {
node_index: NodeIndex(None),
range: 8..15,
name: Identifier {
id: Name("T"),
range: 8..9,
node_index: NodeIndex(None),
},
bound: None,
default: Some(
Name(
ExprName {
node_index: NodeIndex(None),
range: 12..15,
id: Name("int"),
ctx: Load,
},
),
),
},
),
TypeVar(
TypeParamTypeVar {
node_index: NodeIndex(None),
range: 17..18,
name: Identifier {
id: Name("U"),
range: 17..18,
node_index: NodeIndex(None),
},
bound: None,
default: None,
},
),
],
},
),
arguments: None,
body: [
Expr(
StmtExpr {
node_index: NodeIndex(None),
range: 21..24,
value: EllipsisLiteral(
ExprEllipsisLiteral {
node_index: NodeIndex(None),
range: 21..24,
},
),
},
),
],
},
),
ClassDef(
StmtClassDef {
node_index: NodeIndex(None),
range: 25..59,
decorator_list: [],
name: Identifier {
id: Name("C"),
range: 31..32,
node_index: NodeIndex(None),
},
type_params: Some(
TypeParams {
range: 32..54,
node_index: NodeIndex(None),
type_params: [
TypeVar(
TypeParamTypeVar {
node_index: NodeIndex(None),
range: 33..35,
name: Identifier {
id: Name("T1"),
range: 33..35,
node_index: NodeIndex(None),
},
bound: None,
default: None,
},
),
TypeVar(
TypeParamTypeVar {
node_index: NodeIndex(None),
range: 37..45,
name: Identifier {
id: Name("T2"),
range: 37..39,
node_index: NodeIndex(None),
},
bound: None,
default: Some(
Name(
ExprName {
node_index: NodeIndex(None),
range: 42..45,
id: Name("int"),
ctx: Load,
},
),
),
},
),
TypeVar(
TypeParamTypeVar {
node_index: NodeIndex(None),
range: 47..49,
name: Identifier {
id: Name("T3"),
range: 47..49,
node_index: NodeIndex(None),
},
bound: None,
default: None,
},
),
TypeVar(
TypeParamTypeVar {
node_index: NodeIndex(None),
range: 51..53,
name: Identifier {
id: Name("T4"),
range: 51..53,
node_index: NodeIndex(None),
},
bound: None,
default: None,
},
),
],
},
),
arguments: None,
body: [
Expr(
StmtExpr {
node_index: NodeIndex(None),
range: 56..59,
value: EllipsisLiteral(
ExprEllipsisLiteral {
node_index: NodeIndex(None),
range: 56..59,
},
),
},
),
],
},
),
TypeAlias(
StmtTypeAlias {
node_index: NodeIndex(None),
range: 60..88,
name: Name(
ExprName {
node_index: NodeIndex(None),
range: 65..70,
id: Name("Alias"),
ctx: Store,
},
),
type_params: Some(
TypeParams {
range: 70..82,
node_index: NodeIndex(None),
type_params: [
TypeVar(
TypeParamTypeVar {
node_index: NodeIndex(None),
range: 71..78,
name: Identifier {
id: Name("T"),
range: 71..72,
node_index: NodeIndex(None),
},
bound: None,
default: Some(
Name(
ExprName {
node_index: NodeIndex(None),
range: 75..78,
id: Name("int"),
ctx: Load,
},
),
),
},
),
TypeVar(
TypeParamTypeVar {
node_index: NodeIndex(None),
range: 80..81,
name: Identifier {
id: Name("U"),
range: 80..81,
node_index: NodeIndex(None),
},
bound: None,
default: None,
},
),
],
},
),
value: EllipsisLiteral(
ExprEllipsisLiteral {
node_index: NodeIndex(None),
range: 85..88,
},
),
},
),
],
},
)
```
## Semantic Syntax Errors
|
1 | class C[T = int, U]: ...
| ^ Syntax Error: non default type parameter `U` follows default type parameter
2 | class C[T1, T2 = int, T3, T4]: ...
3 | type Alias[T = int, U] = ...
|
|
1 | class C[T = int, U]: ...
2 | class C[T1, T2 = int, T3, T4]: ...
| ^^ Syntax Error: non default type parameter `T3` follows default type parameter
3 | type Alias[T = int, U] = ...
|
|
1 | class C[T = int, U]: ...
2 | class C[T1, T2 = int, T3, T4]: ...
| ^^ Syntax Error: non default type parameter `T4` follows default type parameter
3 | type Alias[T = int, U] = ...
|
|
1 | class C[T = int, U]: ...
2 | class C[T1, T2 = int, T3, T4]: ...
3 | type Alias[T = int, U] = ...
| ^ Syntax Error: non default type parameter `U` follows default type parameter
|
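
The snapshot above reports one error for every type parameter without a default that appears after a parameter with a default (`T3` and `T4` are both flagged in `class C[T1, T2 = int, T3, T4]`). A minimal sketch of a check with that behavior is shown here. `TypeParam` and `non_default_after_default` are hypothetical names for illustration; this is not the checker's actual implementation, which is not part of this diff.

```rust
/// Hypothetical, simplified type parameter: a name plus whether it has a default.
struct TypeParam<'a> {
    name: &'a str,
    has_default: bool,
}

/// Returns the name of every parameter without a default that appears after a
/// parameter with a default, in source order.
fn non_default_after_default<'a>(params: &[TypeParam<'a>]) -> Vec<&'a str> {
    let mut seen_default = false;
    let mut offenders = Vec::new();
    for param in params {
        if param.has_default {
            // From here on, every parameter is expected to carry a default.
            seen_default = true;
        } else if seen_default {
            offenders.push(param.name);
        }
    }
    offenders
}
```

Tracking a single `seen_default` flag keeps the check linear in the number of parameters and yields one diagnostic per offending parameter rather than one per statement.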


@@ -19,7 +19,6 @@ ruff_memory_usage = { workspace = true }
ruff_python_ast = { workspace = true }
ruff_python_codegen = { workspace = true }
ruff_python_importer = { workspace = true }
ruff_python_parser = { workspace = true }
ruff_python_trivia = { workspace = true }
ruff_source_file = { workspace = true }
ruff_text_size = { workspace = true }
@@ -37,6 +36,8 @@ smallvec = { workspace = true }
tracing = { workspace = true }
[dev-dependencies]
ruff_python_parser = { workspace = true }
camino = { workspace = true }
insta = { workspace = true, features = ["filters"] }


@@ -5,9 +5,9 @@ use ruff_db::parsed::{ParsedModuleRef, parsed_module};
use ruff_db::source::source_text;
use ruff_diagnostics::Edit;
use ruff_python_ast::name::Name;
use ruff_python_ast::token::{Token, TokenAt, TokenKind, Tokens};
use ruff_python_ast::{self as ast, AnyNodeRef};
use ruff_python_codegen::Stylist;
use ruff_python_parser::{Token, TokenAt, TokenKind, Tokens};
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};
use ty_python_semantic::types::UnionType;
use ty_python_semantic::{
@@ -1557,7 +1557,8 @@ fn compare_suggestions(c1: &Completion, c2: &Completion) -> Ordering {
#[cfg(test)]
mod tests {
use insta::assert_snapshot;
use ruff_python_parser::{Mode, ParseOptions, TokenKind, Tokens};
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_parser::{Mode, ParseOptions};
use ty_python_semantic::ModuleName;
use crate::completion::{Completion, completion};


@@ -8,8 +8,8 @@ use std::borrow::Cow;
use crate::find_node::covering_node;
use crate::stub_mapping::StubMapper;
use ruff_db::parsed::ParsedModuleRef;
use ruff_python_ast::token::{TokenKind, Tokens};
use ruff_python_ast::{self as ast, AnyNodeRef};
use ruff_python_parser::{TokenKind, Tokens};
use ruff_text_size::{Ranged, TextRange, TextSize};
use ty_python_semantic::ResolvedDefinition;


@@ -24,10 +24,10 @@ use ruff_db::source::source_text;
use ruff_diagnostics::Edit;
use ruff_python_ast as ast;
use ruff_python_ast::name::Name;
use ruff_python_ast::token::Tokens;
use ruff_python_ast::visitor::source_order::{SourceOrderVisitor, TraversalSignal, walk_stmt};
use ruff_python_codegen::Stylist;
use ruff_python_importer::Insertion;
use ruff_python_parser::{Parsed, Tokens};
use ruff_text_size::{Ranged, TextRange, TextSize};
use ty_project::Db;
use ty_python_semantic::semantic_index::definition::DefinitionKind;
@@ -76,7 +76,7 @@ impl<'a> Importer<'a> {
source: &'a str,
parsed: &'a ParsedModuleRef,
) -> Self {
let imports = TopLevelImports::find(parsed);
let imports = TopLevelImports::find(parsed.syntax());
Self {
db,
@@ -327,9 +327,7 @@ impl<'ast> MembersInScope<'ast> {
.members_in_scope_at(node)
.into_iter()
.map(|(name, memberdef)| {
let Some(def) = memberdef.definition else {
return (name, MemberInScope::other(memberdef.ty));
};
let def = memberdef.first_reachable_definition;
let kind = match *def.kind(db) {
DefinitionKind::Import(ref kind) => {
MemberImportKind::Imported(AstImportKind::Import(kind.import(parsed)))
@@ -749,9 +747,9 @@ struct TopLevelImports<'ast> {
impl<'ast> TopLevelImports<'ast> {
/// Find all top-level imports from the given AST of a Python module.
fn find(parsed: &'ast Parsed<ast::ModModule>) -> Vec<AstImport<'ast>> {
fn find(module: &'ast ast::ModModule) -> Vec<AstImport<'ast>> {
let mut visitor = TopLevelImports::default();
visitor.visit_body(parsed.suite());
visitor.visit_body(&module.body);
visitor.imports
}
}
@@ -1891,13 +1889,13 @@ else:
"#);
assert_snapshot!(
test.import_from("foo", "MAGIC"), @r#"
import foo
from foo import MAGIC
if os.getenv("WHATEVER"):
from foo import MAGIC
else:
from bar import MAGIC
(foo.MAGIC)
(MAGIC)
"#);
}
@@ -2108,13 +2106,13 @@ except ImportError:
");
assert_snapshot!(
test.import_from("foo", "MAGIC"), @r"
import foo
from foo import MAGIC
try:
from foo import MAGIC
except ImportError:
from bar import MAGIC
(foo.MAGIC)
(MAGIC)
");
}


@@ -14,11 +14,11 @@ use crate::find_node::CoveringNode;
use crate::goto::GotoTarget;
use crate::{Db, NavigationTargets, ReferenceKind, ReferenceTarget};
use ruff_db::files::File;
use ruff_python_ast::token::Tokens;
use ruff_python_ast::{
self as ast, AnyNodeRef,
visitor::source_order::{SourceOrderVisitor, TraversalSignal},
};
use ruff_python_parser::Tokens;
use ruff_text_size::{Ranged, TextRange};
use ty_python_semantic::{ImportAliasResolution, SemanticModel};


@@ -11,8 +11,8 @@ use crate::goto::Definitions;
use crate::{Db, find_node::covering_node};
use ruff_db::files::File;
use ruff_db::parsed::parsed_module;
use ruff_python_ast::token::TokenKind;
use ruff_python_ast::{self as ast, AnyNodeRef};
use ruff_python_parser::TokenKind;
use ruff_text_size::{Ranged, TextRange, TextSize};
use ty_python_semantic::ResolvedDefinition;
use ty_python_semantic::SemanticModel;
@@ -381,7 +381,7 @@ mod tests {
f = func_a
else:
f = func_b
f(<CURSOR>
"#,
);
@@ -426,10 +426,10 @@ mod tests {
@overload
def process(value: int) -> str: ...
@overload
def process(value: str) -> int: ...
def process(value):
if isinstance(value, int):
return str(value)
@@ -826,10 +826,10 @@ def ab(a: int, *, c: int):
r#"
class Point:
"""A simple point class representing a 2D coordinate."""
def __init__(self, x: int, y: int):
"""Initialize a point with x and y coordinates.
Args:
x: The x-coordinate
y: The y-coordinate
@@ -961,12 +961,12 @@ def ab(a: int, *, c: int):
r#"
from typing import overload
@overload
@overload
def process(value: int) -> str: ...
@overload
def process(value: str, flag: bool) -> int: ...
def process(value, flag=None):
if isinstance(value, int):
return str(value)


@@ -379,6 +379,16 @@ pub(crate) fn symbols_for_file_global_only(db: &dyn Db, file: File) -> FlatSymbo
global_only: true,
};
visitor.visit_body(&module.syntax().body);
if file
.path(db)
.as_system_path()
.is_none_or(|path| !db.project().is_file_included(db, path))
{
// Eagerly clear ASTs of third party files.
parsed.clear();
}
FlatSymbols {
symbols: visitor.symbols,
}
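
The condition added in the hunk above clears the parsed AST when the file either has no system path or is not included in the project. The sketch below isolates that decision as a pure function. `should_clear_ast` and the `is_included` callback are hypothetical stand-ins (the latter for `is_file_included`), and `map_or(true, ...)` is used as an equivalent of `Option::is_none_or` so the example also compiles on older toolchains.

```rust
/// Decides whether to drop a file's AST after its symbols have been collected.
///
/// `system_path` is `None` when the file has no system path; `is_included` is a
/// hypothetical stand-in for the project's `is_file_included` check.
fn should_clear_ast(system_path: Option<&str>, is_included: impl Fn(&str) -> bool) -> bool {
    // Same truth table as `system_path.is_none_or(|path| !is_included(path))`:
    // clear when there is no system path, or when the path is not part of the project.
    system_path.map_or(true, |path| !is_included(path))
}
```

Only files that have a system path and belong to the project keep their AST in this sketch; everything else is treated as third party and cleared eagerly.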


@@ -7,6 +7,7 @@ pub use self::changes::ChangeResult;
use crate::CollectReporter;
use crate::metadata::settings::file_settings;
use crate::{ProgressReporter, Project, ProjectMetadata};
use get_size2::StandardTracker;
use ruff_db::Db as SourceDb;
use ruff_db::diagnostic::Diagnostic;
use ruff_db::files::{File, Files};
@@ -129,7 +130,10 @@ impl ProjectDatabase {
/// Returns a [`SalsaMemoryDump`] that can be used to dump Salsa memory usage information
/// to the CLI after a typechecker run.
pub fn salsa_memory_dump(&self) -> SalsaMemoryDump {
let memory_usage = <dyn salsa::Database>::memory_usage(self);
let memory_usage = ruff_memory_usage::attach_tracker(StandardTracker::new(), || {
<dyn salsa::Database>::memory_usage(self)
});
let mut ingredients = memory_usage
.structs
.into_iter()

Some files were not shown because too many files have changed in this diff.