Compare commits

...

21 Commits

Author SHA1 Message Date
Dhruv Manilawala
756e7b6c82 Store current TokenKind on Parser 2024-03-18 20:17:50 +05:30
Dhruv Manilawala
4f5604fc83 Fix clippy warnings 2024-03-14 13:31:04 +05:30
Dhruv Manilawala
4381629e13 Use recovery context to decide on trailing comma (#10405)
## Summary

This PR removes the `allow_trailing_comma` parameter from list parsing
because it can be inferred using the recovery context kind.
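A rough sketch of the inference, under assumed names: `RecoveryContextKind` comes from this description, but the variants and the method shown here are illustrative rather than ruff's actual API.

```rust
/// Illustrative list kinds; the real enum has many more variants.
#[derive(Clone, Copy)]
enum RecoveryContextKind {
    TupleElements,
    ImportFromAsNames,
    TypeParams,
}

impl RecoveryContextKind {
    /// Whether this kind of list permits a trailing comma, so callers no
    /// longer need to pass `allow_trailing_comma` explicitly.
    const fn allows_trailing_comma(self) -> bool {
        match self {
            RecoveryContextKind::TupleElements | RecoveryContextKind::TypeParams => true,
            // `from m import a, b,` without parentheses rejects a trailing comma.
            RecoveryContextKind::ImportFromAsNames => false,
        }
    }
}

fn main() {
    assert!(RecoveryContextKind::TupleElements.allows_trailing_comma());
    assert!(!RecoveryContextKind::ImportFromAsNames.allows_trailing_comma());
}
```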
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
b09e5f40df Remove Expr::Invalid (#10386)
## Summary

This PR removes the `Expr::Invalid` variant from the AST. Instead, we'll
try to retain as much valid information as possible and use an empty
`Expr::Name` with `ExprContext::Invalid` as a replacement.
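A minimal sketch of the replacement value, assuming the `ruff_python_ast` items that appear later in this diff (`ExprName`, `ExprContext::Invalid`); the helper function itself is hypothetical.

```rust
use ruff_python_ast::{self as ast, Expr, ExprContext};
use ruff_text_size::TextRange;

/// Placeholder for positions where no valid expression could be parsed:
/// an empty `Name` whose context marks it as invalid.
fn invalid_expr(range: TextRange) -> Expr {
    Expr::Name(ast::ExprName {
        range,
        id: "".into(), // empty identifier signals the invalid node
        ctx: ExprContext::Invalid,
    })
}
```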

## Test Plan

- [x] All tests pass
- [x] No performance regression
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
1ec7259116 Use string flags to mark it as invalid (#10385)
## Summary

This PR updates the string flags to include an `Invalid` variant for any
string nodes deemed invalid by the parser. This is to avoid dropping the
nodes and instead just mark them as invalid. The nodes will be empty for
now, but we can discuss whether to keep the raw source text or not. It's
not strictly required because the range can be used to retrieve the same
text.

It also adds a new `handle_implicitly_concatenated_strings` method which
is similar to the existing `concatenated_strings` function. The reason to
have a separate method is to avoid dropping all strings if there's an
error, the error being a concatenation of bytes and non-bytes literals.
Now, we need to decide which strings to retain. Currently, I've kept it
simple: retain bytes literals _only_ if all of the parts are bytes;
otherwise we'll have a string / f-string with invalid nodes instead of a
bytes literal.

This removes the need for having a `StringType::Invalid` variant.
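For reference, the AST diff further down adds `StringLiteral::invalid` (and a bytes counterpart), so a recovery path can keep a placeholder node instead of dropping it. A minimal usage sketch:

```rust
use ruff_python_ast::StringLiteral;
use ruff_text_size::{TextRange, TextSize};

fn main() {
    // An invalid string literal keeps its source range but carries an
    // empty value and the INVALID flag (see `with_invalid` in the diff).
    let range = TextRange::new(TextSize::from(0), TextSize::from(5));
    let literal = StringLiteral::invalid(range);
    assert_eq!(literal.as_str(), "");
}
```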

## Test Plan

- [x] Existing test cases pass
- [x] No performance regression
2024-03-14 13:31:04 +05:30
Victor Hugo Gomes
d41ecfe351 Remove f-string UnclosedLbrace error checking from the lexer (#10372)
This error check is already handled by the new parser.

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
156f7994a7 Set new parser for benchmarks 2024-03-14 13:31:04 +05:30
Dhruv Manilawala
559832ba4d Merge string parsing to single entrypoint (#10339)
This PR merges the different string parsing functions into a single
entry point function.

Previously there were two entry points, one for string or byte literals
and the other for f-strings. The reason for this separation was that our
old parser raised a hard syntax error if an f-string was used as a
pattern literal. But it's actually a soft syntax error, as evidenced by
the CPython parser, which raises it at runtime.

This function basically implements the following grammar:
```
strings: (string|fstring)+
```

And it delegates to the list parsing logic for better error recovery.
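A self-contained sketch of that shape, with stand-in token and part types (not the ruff parser's real types):

```rust
#[derive(Clone, Copy)]
enum Token {
    Str,
    FStringStart,
    Other,
}

#[derive(Debug)]
enum StringPart {
    Literal,
    FString,
}

/// Parses `(string|fstring)+` from the front of `tokens`, returning the
/// collected parts and the number of tokens consumed.
fn parse_strings(tokens: &[Token]) -> (Vec<StringPart>, usize) {
    let mut parts = Vec::new();
    let mut consumed = 0;
    while let Some(token) = tokens.get(consumed) {
        match token {
            Token::Str => parts.push(StringPart::Literal),
            Token::FStringStart => parts.push(StringPart::FString),
            Token::Other => break,
        }
        consumed += 1;
    }
    (parts, consumed)
}

fn main() {
    // Implicit concatenation: a string literal followed by an f-string.
    let tokens = [Token::Str, Token::FStringStart, Token::Other];
    let (parts, consumed) = parse_strings(&tokens);
    println!("{parts:?} ({consumed} tokens consumed)");
}
```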

- [x] All tests pass
- [x] No performance regression
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
98f6dcbb91 Introduce LexicalError::InvalidByteLiteral (#10328)
## Summary

This PR introduces a new `InvalidByteLiteral` lexical error type to
avoid repeating the message in multiple locations.
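A sketch of the idea with a stand-in type: the message lives in one place, the error type's `Display` impl, instead of being repeated at each call site. The exact wording here is illustrative.

```rust
use std::fmt;

enum LexicalErrorType {
    InvalidByteLiteral,
}

impl fmt::Display for LexicalErrorType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Single source of truth for the message.
            LexicalErrorType::InvalidByteLiteral => {
                write!(f, "bytes can only contain ASCII literal characters")
            }
        }
    }
}

fn main() {
    println!("{}", LexicalErrorType::InvalidByteLiteral);
}
```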
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
94cc5f2e13 Remove FStringElement::Invalid, improve parsing logic (#10327)
This PR does the following around f-string parsing:
1. Removes the `FStringElement::Invalid` variant
2. Moves the parsing of f-string elements to use the list parsing logic
3. Adds error recovery for f-string elements

- [x] All tests pass
- [x] No performance regression
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
cdcbb04686 Fix tests and error recovery token sets (#10338)
This PR fixes the failing CI tests from the original PR. The changes
have been highlighted using PR comments to provide the proper context.

`cargo test`
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
70aa19e9af Move parser quick test to pretty print code 2024-03-14 13:31:04 +05:30
Dhruv Manilawala
aac2023999 Remove Pattern::Invalid variant (#10294)
## Summary

This PR removes the `Pattern::Invalid` variant. There are no references
to it in the parser.
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
2b00e81c22 Remove skip_until parser method (#10293)
## Summary

This PR removes the `skip_until` parser method. The main use case for it
was error recovery, which we want to isolate to list parsing.

There are two references, both of which are removed:
1. Parsing a list of match arguments in a class pattern. Take the
following code snippet as an example:

	```python
	match foo:
		case Foo(bar.z=1, baz):
			pass
	```
This is a syntax error because a keyword argument pattern can only have
an identifier, but here it's an attribute node. Previously, to move on to
the next argument (`baz`), the parser would skip until the end of the
argument to recover. What we do now is parse the value as a pattern (per
the spec), thus moving the parser ahead, and add the node with an empty
identifier.

	The above code will produce the following AST:

	<details><summary><b>AST</b></summary>
	<p>
	
	```rs
	Module(
	    ModModule {
	        range: 0..52,
	        body: [
	            Match(
	                StmtMatch {
	                    range: 0..51,
	                    subject: Name(
	                        ExprName {
	                            range: 6..9,
	                            id: "foo",
	                            ctx: Load,
	                        },
	                    ),
	                    cases: [
	                        MatchCase {
	                            range: 15..51,
	                            pattern: MatchClass(
	                                PatternMatchClass {
	                                    range: 20..37,
	                                    cls: Name(
	                                        ExprName {
	                                            range: 20..23,
	                                            id: "Foo",
	                                            ctx: Load,
	                                        },
	                                    ),
	                                    arguments: PatternArguments {
	                                        range: 24..37,
	                                        patterns: [
	                                            MatchAs(
	                                                PatternMatchAs {
	                                                    range: 33..36,
	                                                    pattern: None,
	                                                    name: Some(
	                                                        Identifier {
	                                                            id: "baz",
	                                                            range: 33..36,
	                                                        },
	                                                    ),
	                                                },
	                                            ),
	                                        ],
	                                        keywords: [
	                                            PatternKeyword {
	                                                range: 24..31,
	                                                attr: Identifier {
	                                                    id: "",
	                                                    range: 31..31,
	                                                },
	                                                pattern: MatchValue(
	                                                    PatternMatchValue {
	                                                        range: 30..31,
	                                                        value: NumberLiteral(
	                                                            ExprNumberLiteral {
	                                                                range: 30..31,
	                                                                value: Int(
	                                                                    1,
	                                                                ),
	                                                            },
	                                                        ),
	                                                    },
	                                                ),
	                                            },
	                                        ],
	                                    },
	                                },
	                            ),
	                            guard: None,
	                            body: [
	                                Pass(
	                                    StmtPass {
	                                        range: 47..51,
	                                    },
	                                ),
	                            ],
	                        },
	                    ],
	                },
	            ),
	        ],
	    },
	)
	```
	
	</p>
	</details> 

2. Parsing a list of parameters. Here, our list parsing method makes
sure to only call the parse element function when it's a valid list
element. A parameter can start with either a `Star`, `DoubleStar`, or
`Name` token, which correspond to the 3 `if` conditions (sketched
below). Thus, the `else` block is not required, as the list parsing
will recover without it.
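A minimal sketch of the lookahead check behind those three `if` conditions, with a stand-in `TokenKind` (the real enum lives in the parser crate):

```rust
#[derive(Clone, Copy)]
enum TokenKind {
    Star,
    DoubleStar,
    Name,
    Comma,
}

/// Only these tokens can begin a parameter; for anything else the list
/// parsing machinery recovers on its own, so no `else` branch is needed.
fn at_parameter_start(kind: TokenKind) -> bool {
    matches!(kind, TokenKind::Star | TokenKind::DoubleStar | TokenKind::Name)
}

fn main() {
    assert!(at_parameter_start(TokenKind::Name));
    assert!(!at_parameter_start(TokenKind::Comma));
}
```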
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
b65e3fb335 Improve various assignment target error (#10288)
## Summary

This PR improves error related things around assignment nodes, mainly
the following:
1. Rename parse error variants:
	a. `AssignmentError` -> `InvalidAssignmentTarget`
	b. `NamedAssignmentError` -> `InvalidNamedAssignmentTarget`
	c. `AugAssignmentError` -> `InvalidAugmentedAssignmentTarget`
2. Add `InvalidDeleteTarget` for invalid `del` targets:
	a. Add a helper function to check if it's a valid delete target,
similar to the other target check functions (see the sketch below).
3. Fix: a named assignment target can only be a `Name` node
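A minimal sketch of the delete-target helper from item 2, assuming the `ruff_python_ast::Expr` type used elsewhere in this diff; the actual helper's name and exact rules may differ.

```rust
use ruff_python_ast::Expr;

/// `del` accepts names, attributes, subscripts, and (possibly nested)
/// lists/tuples of those; everything else is an invalid delete target.
fn is_valid_delete_target(expr: &Expr) -> bool {
    match expr {
        Expr::Name(_) | Expr::Attribute(_) | Expr::Subscript(_) => true,
        Expr::Tuple(tuple) => tuple.elts.iter().all(is_valid_delete_target),
        Expr::List(list) => list.elts.iter().all(is_valid_delete_target),
        _ => false,
    }
}
```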

## Test Plan

Various test cases locally. As mentioned in my previous PR, I want to
keep the testing part separate.
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
814438777c Remove deprecated parsing list functions (#10271)
## Summary

This PR removes the deprecated parsing list functions and updates the
references to use the new functions.

There are now 4 functions to accommodate this pattern. They are divided
into 2 groups: one to parse a sequence of elements and the other to
parse a sequence of elements _separated_ by a comma. In each group,
there are 2 functions: one collects and returns all the parsed elements
as a vector, while the other delegates the collection to the caller.
This separation is achieved by using `Fn` and `FnMut` to allow mutation
in the latter case.
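A self-contained sketch of the comma-separated half of that scheme, under assumed names (`Parser`, `Tok`, and both method names are stand-ins, not the actual ruff_python_parser API):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Tok {
    Name(char),
    Comma,
    Rpar,
}

struct Parser {
    tokens: Vec<Tok>,
    pos: usize,
}

impl Parser {
    fn current(&self) -> Tok {
        self.tokens.get(self.pos).copied().unwrap_or(Tok::Rpar)
    }

    fn eat(&mut self, tok: Tok) -> bool {
        if self.current() == tok {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    /// Delegating variant: hands each element to the caller's closure.
    /// `FnMut` lets that closure mutate caller-local state.
    fn parse_comma_separated(&mut self, terminator: Tok, mut each: impl FnMut(&mut Self)) {
        while self.current() != terminator {
            each(self);
            // A missing comma before the terminator ends the list; the
            // real parser consults the recovery context here instead.
            if !self.eat(Tok::Comma) {
                break;
            }
        }
    }

    /// Collecting variant, layered on top of the delegating one.
    fn parse_comma_separated_vec<T>(
        &mut self,
        terminator: Tok,
        mut element: impl FnMut(&mut Self) -> T,
    ) -> Vec<T> {
        let mut out = Vec::new();
        self.parse_comma_separated(terminator, |p| out.push(element(p)));
        out
    }
}

fn main() {
    let mut parser = Parser {
        tokens: vec![Tok::Name('a'), Tok::Comma, Tok::Name('b'), Tok::Rpar],
        pos: 0,
    };
    let names = parser.parse_comma_separated_vec(Tok::Rpar, |p| {
        let name = p.current();
        p.pos += 1; // consume the element token
        name
    });
    println!("{names:?}"); // [Name('a'), Name('b')]
}
```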

The error recovery context has been updated to accommodate the new
sequence kind. Currently, the terminator token kinds only contain the
tokens necessary to end the list, not necessarily the ones which might
help in error recovery. This will be updated as I go through the testing
phase, which basically consists of coming up with a bunch of invalid
programs to check how the parser behaves and how we can help in the
recovery phase.


## Test Plan

Currently, my plan is to keep the testing part separate from the actual
update. This doesn't mean I'm not testing locally, but it's not thorough.
The main reason is to keep the diffs minimal; writing test cases will
require some effort which I want to decouple from the actual change. This
is ok here as it's not getting merged into `main` but into the parser PR.
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
f6467216dc Rename to include "token" in method name (#10287)
Small quality of life improvement to rename the following methods:
1. `current_kind` -> `current_token_kind`
2. `current_range` -> `current_token_range`

It's a PR for visibility.
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
bf67f129dd Encapsulate Program fields (#10270)
## Summary

This PR updates the fields in the `Program` struct to be private and
exposes methods to get the values. The motivation behind this is to
encapsulate the internal representation of the parsed program, which we
could alter in the future.
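An illustrative shape of the encapsulation, with stand-in field types; the actual `Program` fields and accessor names may differ.

```rust
struct Mod;        // stand-in for the parsed module
struct ParseError; // stand-in for the parser's error type

pub struct Program {
    // Private fields: the internal representation can change freely.
    ast: Mod,
    parse_errors: Vec<ParseError>,
}

impl Program {
    pub fn ast(&self) -> &Mod {
        &self.ast
    }

    pub fn parse_errors(&self) -> &[ParseError] {
        &self.parse_errors
    }
}
```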
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
3ee670440c Assert the parser is at augmented assign token (#10269)
## Summary

This PR fixes one of the `FIXME` comments to assert that the parser is
at one of the possible augmented assignment tokens when parsing an
augmented assignment statement.
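A hypothetical sketch of the assertion, with a stand-in token enum covering a subset of the operators exercised by the new test cases:

```rust
#[derive(Clone, Copy)]
enum TokenKind {
    PlusEqual,
    MinusEqual,
    StarEqual,
    AtEqual,
    Equal,
}

fn is_augmented_assign(kind: TokenKind) -> bool {
    matches!(
        kind,
        TokenKind::PlusEqual | TokenKind::MinusEqual | TokenKind::StarEqual | TokenKind::AtEqual
    )
}

fn parse_augmented_assignment(current: TokenKind) {
    // The FIXME fix: fail loudly in debug builds if we were dispatched
    // here without an augmented-assignment operator under the cursor.
    debug_assert!(is_augmented_assign(current));
    // ... parse target, operator, and value ...
}

fn main() {
    parse_augmented_assignment(TokenKind::PlusEqual);
}
```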

## Test Plan

1. Add valid test cases for all the possible augmented assignment tokens
2. Add invalid test cases similar to assignment statement
2024-03-14 13:31:04 +05:30
Dhruv Manilawala
0de3f2f92d Fix tests and clippy warnings 2024-03-14 13:31:04 +05:30
Victor Hugo Gomes
f4a8ab8756 Replace LALRPOP parser with hand-written parser
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-03-14 13:31:04 +05:30
268 changed files with 40258 additions and 6202 deletions

Cargo.lock generated
View File

@@ -2353,9 +2353,11 @@ dependencies = [
name = "ruff_python_parser"
version = "0.0.0"
dependencies = [
"annotate-snippets 0.9.2",
"anyhow",
"bitflags 2.4.2",
"bstr",
"drop_bomb",
"insta",
"is-macro",
"itertools 0.12.1",
@@ -2363,6 +2365,7 @@ dependencies = [
"lalrpop-util",
"memchr",
"ruff_python_ast",
"ruff_source_file",
"ruff_text_size",
"rustc-hash",
"static_assertions",

View File

@@ -523,7 +523,7 @@ from module import =
----- stdout -----
----- stderr -----
error: Failed to parse main.py:2:20: Unexpected token '='
error: Failed to parse main.py:2:20: Unexpected token =
"###);
Ok(())

View File

@@ -731,11 +731,11 @@ fn stdin_parse_error() {
success: false
exit_code: 1
----- stdout -----
-:1:17: E999 SyntaxError: Unexpected token '='
-:1:17: E999 SyntaxError: Unexpected token =
Found 1 error.
----- stderr -----
error: Failed to parse at 1:17: Unexpected token '='
error: Failed to parse at 1:17: Unexpected token =
"###);
}

View File

@@ -52,6 +52,7 @@ file_resolver.exclude = [
file_resolver.extend_exclude = [
"crates/ruff_linter/resources/",
"crates/ruff_python_formatter/resources/",
"crates/ruff_python_parser/resources/",
]
file_resolver.force_exclude = false
file_resolver.include = [

View File

@@ -7,7 +7,7 @@ use ruff_benchmark::{TestCase, TestFile, TestFileDownloadError};
use ruff_python_formatter::{format_module_ast, PreviewMode, PyFormatOptions};
use ruff_python_index::CommentRangesBuilder;
use ruff_python_parser::lexer::lex;
use ruff_python_parser::{allocate_tokens_vec, parse_tokens, Mode};
use ruff_python_parser::{allocate_tokens_vec, parse_tokens, set_new_parser, Mode};
#[cfg(target_os = "windows")]
#[global_allocator]
@@ -42,6 +42,8 @@ fn create_test_cases() -> Result<Vec<TestCase>, TestFileDownloadError> {
}
fn benchmark_formatter(criterion: &mut Criterion) {
set_new_parser(true);
let mut group = criterion.benchmark_group("formatter");
let test_cases = create_test_cases().unwrap();

View File

@@ -2,7 +2,7 @@ use ruff_benchmark::criterion::{
criterion_group, criterion_main, measurement::WallTime, BenchmarkId, Criterion, Throughput,
};
use ruff_benchmark::{TestCase, TestFile, TestFileDownloadError};
use ruff_python_parser::{lexer, Mode};
use ruff_python_parser::{lexer, set_new_parser, Mode};
#[cfg(target_os = "windows")]
#[global_allocator]
@@ -37,6 +37,8 @@ fn create_test_cases() -> Result<Vec<TestCase>, TestFileDownloadError> {
}
fn benchmark_lexer(criterion: &mut Criterion<WallTime>) {
set_new_parser(true);
let test_cases = create_test_cases().unwrap();
let mut group = criterion.benchmark_group("lexer");

View File

@@ -10,7 +10,7 @@ use ruff_linter::settings::{flags, LinterSettings};
use ruff_linter::source_kind::SourceKind;
use ruff_linter::{registry::Rule, RuleSelector};
use ruff_python_ast::PySourceType;
use ruff_python_parser::{lexer, parse_program_tokens, Mode};
use ruff_python_parser::{lexer, parse_program_tokens, set_new_parser, Mode};
#[cfg(target_os = "windows")]
#[global_allocator]
@@ -45,6 +45,8 @@ fn create_test_cases() -> Result<Vec<TestCase>, TestFileDownloadError> {
}
fn benchmark_linter(mut group: BenchmarkGroup, settings: &LinterSettings) {
set_new_parser(true);
let test_cases = create_test_cases().unwrap();
for case in test_cases {

View File

@@ -4,7 +4,7 @@ use ruff_benchmark::criterion::{
use ruff_benchmark::{TestCase, TestFile, TestFileDownloadError};
use ruff_python_ast::statement_visitor::{walk_stmt, StatementVisitor};
use ruff_python_ast::Stmt;
use ruff_python_parser::parse_suite;
use ruff_python_parser::{parse_suite, set_new_parser};
#[cfg(target_os = "windows")]
#[global_allocator]
@@ -50,6 +50,8 @@ impl<'a> StatementVisitor<'a> for CountVisitor {
}
fn benchmark_parser(criterion: &mut Criterion<WallTime>) {
set_new_parser(true);
let test_cases = create_test_cases().unwrap();
let mut group = criterion.benchmark_group("parser");

View File

@@ -254,7 +254,7 @@ pub(crate) fn expression(expr: &Expr, checker: &mut Checker) {
}
}
}
ExprContext::Del => {}
_ => {}
}
if checker.enabled(Rule::SixPY3) {
flake8_2020::rules::name_or_attribute(checker, expr);

View File

@@ -986,6 +986,7 @@ impl<'a> Visitor<'a> for Checker<'a> {
ExprContext::Load => self.handle_node_load(expr),
ExprContext::Store => self.handle_node_store(id, expr),
ExprContext::Del => self.handle_node_delete(expr),
ExprContext::Invalid => {}
},
_ => {}
}

View File

@@ -194,7 +194,7 @@ impl DisplayParseError {
// Translate the byte offset to a location in the originating source.
let location =
if let Some(jupyter_index) = source_kind.as_ipy_notebook().map(Notebook::index) {
let source_location = source_code.source_location(error.offset);
let source_location = source_code.source_location(error.location.start());
ErrorLocation::Cell(
jupyter_index
@@ -208,7 +208,7 @@ impl DisplayParseError {
},
)
} else {
ErrorLocation::File(source_code.source_location(error.offset))
ErrorLocation::File(source_code.source_location(error.location.start()))
};
Self {
@@ -275,27 +275,7 @@ impl<'a> DisplayParseErrorType<'a> {
impl Display for DisplayParseErrorType<'_> {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
match self.0 {
ParseErrorType::Eof => write!(f, "Expected token but reached end of file."),
ParseErrorType::ExtraToken(ref tok) => write!(
f,
"Got extraneous token: {tok}",
tok = TruncateAtNewline(&tok)
),
ParseErrorType::InvalidToken => write!(f, "Got invalid token"),
ParseErrorType::UnrecognizedToken(ref tok, ref expected) => {
if let Some(expected) = expected.as_ref() {
write!(
f,
"Expected '{expected}', but got {tok}",
tok = TruncateAtNewline(&tok)
)
} else {
write!(f, "Unexpected token {tok}", tok = TruncateAtNewline(&tok))
}
}
ParseErrorType::Lexical(ref error) => write!(f, "{error}"),
}
write!(f, "{}", TruncateAtNewline(&self.0))
}
}

View File

@@ -67,7 +67,7 @@ impl<'a> Visitor<'a> for LoadedNamesVisitor<'a> {
Expr::Name(name) => match &name.ctx {
ExprContext::Load => self.loaded.push(name),
ExprContext::Store => self.stored.push(name),
ExprContext::Del => {}
_ => {}
},
_ => visitor::walk_expr(self, expr),
}

View File

@@ -14,5 +14,3 @@ bom_unsorted.py:1:1: I001 [*] Import block is un-sorted or un-formatted
2 |-import bar
1 |+import bar
2 |+import foo

View File

@@ -81,7 +81,7 @@ pub(crate) fn syntax_error(
parse_error: &ParseError,
locator: &Locator,
) {
let rest = locator.after(parse_error.offset);
let rest = locator.after(parse_error.location.start());
// Try to create a non-empty range so that the diagnostic can print a caret at the
// right position. This requires that we retrieve the next character, if any, and take its length
@@ -95,6 +95,6 @@ pub(crate) fn syntax_error(
SyntaxError {
message: format!("{}", DisplayParseErrorType::new(&parse_error.error)),
},
TextRange::at(parse_error.offset, len),
TextRange::at(parse_error.location.start(), len),
));
}

View File

@@ -8,5 +8,3 @@ E999.py:3:1: E999 SyntaxError: unindent does not match any outer indentation lev
| ^ E999
4 |
|

View File

@@ -110,5 +110,3 @@ UP027.py:10:17: UP027 [*] Replace unpacked list comprehension with a generator e
14 14 |
15 15 | # Should not change
16 16 | foo = [fn(x) for x in items]

View File

@@ -234,6 +234,7 @@ pub enum ComparablePattern<'a> {
MatchStar(PatternMatchStar<'a>),
MatchAs(PatternMatchAs<'a>),
MatchOr(PatternMatchOr<'a>),
Invalid,
}
impl<'a> From<&'a ast::Pattern> for ComparablePattern<'a> {
@@ -864,6 +865,7 @@ pub enum ComparableExpr<'a> {
Tuple(ExprTuple<'a>),
Slice(ExprSlice<'a>),
IpyEscapeCommand(ExprIpyEscapeCommand<'a>),
Invalid,
}
impl<'a> From<&'a Box<ast::Expr>> for Box<ComparableExpr<'a>> {

View File

@@ -1602,7 +1602,7 @@ mod tests {
fn any_over_stmt_type_alias() {
let seen = RefCell::new(Vec::new());
let name = Expr::Name(ExprName {
id: "x".to_string(),
id: "x".into(),
range: TextRange::default(),
ctx: ExprContext::Load,
});

View File

@@ -1,6 +1,7 @@
#![allow(clippy::derive_partial_eq_without_eq)]
use std::cell::OnceCell;
use std::fmt;
use std::fmt::Debug;
use std::ops::Deref;
@@ -947,12 +948,19 @@ impl Ranged for FStringExpressionElement {
}
}
/// An `FStringLiteralElement` with an empty `value` is an invalid f-string element.
#[derive(Clone, Debug, PartialEq)]
pub struct FStringLiteralElement {
pub range: TextRange,
pub value: Box<str>,
}
impl FStringLiteralElement {
pub fn is_valid(&self) -> bool {
!self.value.is_empty()
}
}
impl Ranged for FStringLiteralElement {
fn range(&self) -> TextRange {
self.range
@@ -1501,6 +1509,9 @@ bitflags! {
/// The string has an `r` or `R` prefix, meaning it is a raw string.
/// It is invalid to set this flag if `U_PREFIX` is also set.
const R_PREFIX = 1 << 3;
/// The string was deemed invalid by the parser.
const INVALID = 1 << 4;
}
}
@@ -1532,6 +1543,12 @@ impl StringLiteralFlags {
self
}
#[must_use]
pub fn with_invalid(mut self) -> Self {
self.0 |= StringLiteralFlagsInner::INVALID;
self
}
pub const fn prefix(self) -> &'static str {
if self.0.contains(StringLiteralFlagsInner::U_PREFIX) {
debug_assert!(!self.0.contains(StringLiteralFlagsInner::R_PREFIX));
@@ -1626,6 +1643,15 @@ impl StringLiteral {
pub fn as_str(&self) -> &str {
self
}
/// Creates an invalid string literal with the given range.
pub fn invalid(range: TextRange) -> Self {
Self {
range,
value: "".into(),
flags: StringLiteralFlags::default().with_invalid(),
}
}
}
impl From<StringLiteral> for Expr {
@@ -1834,6 +1860,9 @@ bitflags! {
/// The bytestring has an `r` or `R` prefix, meaning it is a raw bytestring.
const R_PREFIX = 1 << 3;
/// The bytestring was deemed invalid by the parser.
const INVALID = 1 << 4;
}
}
@@ -1861,6 +1890,12 @@ impl BytesLiteralFlags {
self
}
#[must_use]
pub fn with_invalid(mut self) -> Self {
self.0 |= BytesLiteralFlagsInner::INVALID;
self
}
/// Does the bytestring have an `r` or `R` prefix?
pub const fn is_raw(self) -> bool {
self.0.contains(BytesLiteralFlagsInner::R_PREFIX)
@@ -1920,6 +1955,15 @@ impl BytesLiteral {
pub fn as_slice(&self) -> &[u8] {
self
}
/// Creates a new invalid bytes literal with the given range.
pub fn invalid(range: TextRange) -> Self {
Self {
range,
value: Box::new([]),
flags: BytesLiteralFlags::default().with_invalid(),
}
}
}
impl From<BytesLiteral> for Expr {
@@ -2119,6 +2163,7 @@ pub enum ExprContext {
Load,
Store,
Del,
Invalid,
}
impl ExprContext {
#[inline]
@@ -3544,10 +3589,17 @@ impl IpyEscapeKind {
}
}
/// An `Identifier` with an empty `id` is invalid.
///
/// For example, in the following code `id` will be empty.
/// ```python
/// def 1():
/// ...
/// ```
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Identifier {
id: String,
range: TextRange,
pub id: String,
pub range: TextRange,
}
impl Identifier {
@@ -3558,6 +3610,10 @@ impl Identifier {
range,
}
}
pub fn is_valid(&self) -> bool {
!self.id.is_empty()
}
}
impl Identifier {

View File

@@ -135,7 +135,7 @@ pub fn format_module_source(
let source_type = options.source_type();
let (tokens, comment_ranges) =
tokens_and_ranges(source, source_type).map_err(|err| ParseError {
offset: err.location(),
location: err.location(),
error: ParseErrorType::Lexical(err.into_error()),
})?;
let module = parse_tokens(tokens, source, source_type.as_mode())?;

View File

@@ -73,7 +73,7 @@ pub fn format_range(
let (tokens, comment_ranges) =
tokens_and_ranges(source, options.source_type()).map_err(|err| ParseError {
offset: err.location(),
location: err.location(),
error: ParseErrorType::Lexical(err.into_error()),
})?;

View File

@@ -13,6 +13,3 @@ input_file: crates/ruff_python_formatter/resources/test/fixtures/ruff/empty_mult
```python
```

View File

@@ -11,6 +11,3 @@ input_file: crates/ruff_python_formatter/resources/test/fixtures/ruff/empty_trai
```python
```

View File

@@ -19,6 +19,7 @@ ruff_text_size = { path = "../ruff_text_size" }
anyhow = { workspace = true }
bitflags = { workspace = true }
drop_bomb = { workspace = true }
bstr = { workspace = true }
is-macro = { workspace = true }
itertools = { workspace = true }
@@ -30,7 +31,10 @@ unicode-ident = { workspace = true }
unicode_names2 = { workspace = true }
[dev-dependencies]
insta = { workspace = true }
ruff_source_file = { path = "../ruff_source_file" }
annotate-snippets = { workspace = true }
insta = { workspace = true, features = ["glob"] }
[build-dependencies]
anyhow = { workspace = true }

View File

@@ -5,7 +5,7 @@ use std::path::{Path, PathBuf};
use tiny_keccak::{Hasher, Sha3};
fn main() {
const SOURCE: &str = "src/python.lalrpop";
const SOURCE: &str = "src/lalrpop/python.lalrpop";
println!("cargo:rerun-if-changed={SOURCE}");
let target;
@@ -14,12 +14,12 @@ fn main() {
#[cfg(feature = "lalrpop")]
{
let out_dir = PathBuf::from(std::env::var_os("OUT_DIR").unwrap());
target = out_dir.join("src/python.rs");
target = out_dir.join("src/lalrpop/python.rs");
}
#[cfg(not(feature = "lalrpop"))]
{
target = PathBuf::from("src/python.rs");
error = "python.lalrpop and src/python.rs doesn't match. This is a ruff_python_parser bug. Please report it unless you are editing ruff_python_parser. Run `lalrpop src/python.lalrpop` to build ruff_python_parser again.";
target = PathBuf::from("src/lalrpop/python.rs");
error = "python.lalrpop and src/lalrpop/python.rs doesn't match. This is a ruff_python_parser bug. Please report it unless you are editing ruff_python_parser. Run `lalrpop src/lalrpop/python.lalrpop` to build ruff_python_parser again.";
}
let Some(message) = requires_lalrpop(SOURCE, &target) else {

View File

@@ -0,0 +1,6 @@
# Check http://editorconfig.org for more information
# This is the main config file for this project:
root = true
[*.py]
insert_final_newline = false

View File

@@ -0,0 +1,6 @@
f(b=20, c)
f(**b, *c)
# Duplicate keyword argument
f(a=20, a=30)

View File

@@ -0,0 +1,7 @@
a = (🐶
# comment 🐶
)
a = (🐶 +
# comment
🐶)

View File

@@ -0,0 +1,2 @@
# TODO(micha): The offset of the generated error message is off by one.
lambda a, b=20, c: 1

View File

@@ -0,0 +1,9 @@
lambda a, a: 1
lambda a, *, a: 1
lambda a, a=20: 1
lambda a, *a: 1
lambda a, *, **a: 1

View File

@@ -0,0 +1,8 @@
# TODO(dhruvmanila): Remove the dummy test case and uncomment the others when this is fixed. See PR #10372
x +
# f'{'
# f'{foo!r'
# f'{foo='
# f"{"
# f"""{"""

View File

@@ -0,0 +1,7 @@
a = pass = c
a + b
a = b = pass = c
a + b

View File

@@ -0,0 +1,7 @@
a = = c
a + b

View File

@@ -0,0 +1,2 @@
# TODO(micha): The range of the generated error message is off by one.
def f(a, b=20, c): pass

View File

@@ -0,0 +1,12 @@
def f(a, a): pass
def f2(a, *, a): pass
def f3(a, a=20): pass
def f4(a, *a): pass
def f5(a, *, **a): pass
# TODO(micha): This is inconsistent. All other examples only highlight the argument name.
def f6(a, a: str): pass

View File

@@ -0,0 +1,24 @@
# FIXME: The type param related error message and the parser recovery are looking pretty good **except**
# that the lexer never recovers from the unclosed `[`, resulting in it lexing `NonLogicalNewline` tokens instead of `Newline` tokens.
# That's because the parser has no way of feeding the error recovery back to the lexer,
# so they don't agree on the state of the world, which can lead to all kinds of errors further down in the file.
# This is not just a problem with parentheses but also with the transformation made by the
# `SoftKeywordTransformer` because the `Parser` and `Transformer` may not agree if they're
# currently in a position where the `type` keyword is allowed or not.
# That roughly means that any kind of recovery can lead to unrelated syntax errors
# on following lines.
def unclosed[A, *B(test: name):
pass
a + b
def keyword[A, await](): ...
def not_a_type_param[A, |, B](): ...
def multiple_commas[A,,B](): ...
def multiple_trailing_commas[A,,](): ...
def multiple_commas_and_recovery[A,,100](): ...

View File

@@ -0,0 +1,7 @@
if True:
pass
elif False:
pass
elf:
pass

View File

@@ -0,0 +1,3 @@
# FIXME(micha): This creates two syntax errors instead of just one (and overlapping ones)
if True)):
pass

View File

@@ -0,0 +1,8 @@
# Improving the recovery would require changing the lexer to emit an extra dedent token after `a + b`.
if True:
pass
a + b
pass
a = 10

View File

@@ -0,0 +1,6 @@
if True:
a + b
if False: # This if statement has neither an indent nor a newline.

View File

@@ -0,0 +1,4 @@
if True
pass
a = 10

View File

@@ -0,0 +1,11 @@
from abc import a, b,
a + b
from abc import ,,
from abc import
from abc import (a, b, c
a + b

View File

@@ -0,0 +1,42 @@
# Regression test: https://github.com/astral-sh/ruff/issues/6895
# First we test, broadly, that various kinds of assignments are now
# rejected by the parser. e.g., `5 = 3`, `5 += 3`, `(5): int = 3`.
5 = 3
5 += 3
(5): int = 3
# Now we exhaustively test all possible cases where assignment can fail.
x or y = 42
(x := 5) = 42
x + y = 42
-x = 42
(lambda _: 1) = 42
a if b else c = 42
{"a": 5} = 42
{a} = 42
[x for x in xs] = 42
{x for x in xs} = 42
{x: x * 2 for x in xs} = 42
(x for x in xs) = 42
await x = 42
(yield x) = 42
(yield from xs) = 42
a < b < c = 42
foo() = 42
f"{quux}" = 42
f"{foo} and {bar}" = 42
"foo" = 42
b"foo" = 42
123 = 42
True = 42
None = 42
... = 42
*foo() = 42
[x, foo(), y] = [42, 42, 42]
[[a, b], [[42]], d] = [[1, 2], [[3]], 4]
(x, foo(), y) = (42, 42, 42)

View File

@@ -0,0 +1,34 @@
# This is similar to `./invalid_assignment_targets.py`, but for augmented
# assignment targets.
x or y += 42
(x := 5) += 42
x + y += 42
-x += 42
(lambda _: 1) += 42
a if b else c += 42
{"a": 5} += 42
{a} += 42
[x for x in xs] += 42
{x for x in xs} += 42
{x: x * 2 for x in xs} += 42
(x for x in xs) += 42
await x += 42
(yield x) += 42
(yield from xs) += 42
a < b < c += 42
foo() += 42
f"{quux}" += 42
f"{foo} and {bar}" += 42
"foo" += 42
b"foo" += 42
123 += 42
True += 42
None += 42
... += 42
*foo() += 42
[x, foo(), y] += [42, 42, 42]
[[a, b], [[42]], d] += [[1, 2], [[3]], 4]
(x, foo(), y) += (42, 42, 42)

View File

@@ -0,0 +1,3 @@
# This test previously passed before the assignment operator checking
# above, but we include it here for good measure.
(5 := 3)

View File

@@ -0,0 +1 @@
x = {y for y in (1, 2, 3)}

View File

@@ -0,0 +1,11 @@
lambda: 1
lambda a, b, c: 1
lambda a, b=20, c=30: 1
lambda *, a, b, c: 1
lambda *, a, b=20, c=30: 1
lambda a, b, c, *, d, e: 0

View File

@@ -0,0 +1 @@
x = [y for y in (1, 2, 3)]

View File

@@ -0,0 +1,4 @@
if x := 1:
pass
(x := 5)

View File

@@ -0,0 +1 @@
x: int = 1

View File

@@ -0,0 +1,40 @@
x = (1, 2, 3)
(x, y) = (1, 2, 3)
[x, y] = (1, 2, 3)
x.y = (1, 2, 3)
x[y] = (1, 2, 3)
(x, *y) = (1, 2, 3)
# This last group of tests checks that assignments we expect to be parsed
# (including some interesting ones) continue to be parsed successfully.
*foo = 42
[x, y, z] = [1, 2, 3]
(x, y, z) = (1, 2, 3)
x[0] = 42
# This is actually a type error, not a syntax error. So check that it
# doesn't fail parsing.
5[0] = 42
x[1:2] = [42]
# This is actually a type error, not a syntax error. So check that it
# doesn't fail parsing.
5[1:2] = [42]
foo.bar = 42
# This is actually an attribute error, not a syntax error. So check that
# it doesn't fail parsing.
"foo".y = 42
foo = 42

View File

@@ -0,0 +1,18 @@
x += 1
x.y += (1, 2, 3)
x[y] += (1, 2, 3)
# All possible augmented assignment tokens
x += 1
x -= 1
x *= 1
x /= 1
x //= 1
x %= 1
x **= 1
x &= 1
x |= 1
x ^= 1
x <<= 1
x >>= 1
x @= 1

View File

@@ -0,0 +1,3 @@
del x
del x.y
del x[y]

View File

@@ -0,0 +1,2 @@
for x in (1, 2, 3):
pass

View File

@@ -0,0 +1,38 @@
def no_parameters():
pass
def positional_parameters(a, b, c):
pass
def positional_parameters_with_default_values(a, b=20, c=30):
pass
def keyword_only_parameters(*, a, b, c):
pass
def keyword_only_parameters_with_defaults(*, a, b=20, c=30):
pass
def positional_and_keyword_parameters(a, b, c, *, d, e, f):
pass
def positional_and_keyword_parameters_with_defaults(a, b, c, *, d, e=20, f=30):
pass
def positional_and_keyword_parameters_with_defaults_and_varargs(
a, b, c, *args, d, e=20, f=30
):
pass
def positional_and_keyword_parameters_with_defaults_and_varargs_and_kwargs(
a, b, c, *args, d, e=20, f=30, **kwargs
):
pass

View File

@@ -0,0 +1,2 @@
with 1 as x:
pass

View File

@@ -0,0 +1,221 @@
use std::fmt;
use ruff_text_size::TextRange;
use crate::{
lexer::{LexicalError, LexicalErrorType},
Tok, TokenKind,
};
/// Represents errors that occur during parsing and are
/// returned by the `parse_*` functions.
#[derive(Debug, PartialEq)]
pub struct ParseError {
pub error: ParseErrorType,
pub location: TextRange,
}
impl std::ops::Deref for ParseError {
type Target = ParseErrorType;
fn deref(&self) -> &Self::Target {
&self.error
}
}
impl std::error::Error for ParseError {
fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
Some(&self.error)
}
}
impl fmt::Display for ParseError {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(f, "{} at byte range {:?}", &self.error, self.location)
}
}
impl From<LexicalError> for ParseError {
fn from(error: LexicalError) -> Self {
ParseError {
location: error.location(),
error: ParseErrorType::Lexical(error.into_error()),
}
}
}
impl ParseError {
pub fn error(self) -> ParseErrorType {
self.error
}
}
/// Represents the different types of errors that can occur during parsing of an f-string.
#[derive(Debug, Clone, PartialEq)]
pub enum FStringErrorType {
/// Expected a right brace after an opened left brace.
UnclosedLbrace,
/// An invalid conversion flag was encountered.
InvalidConversionFlag,
/// A single right brace was encountered.
SingleRbrace,
/// Unterminated string.
UnterminatedString,
/// Unterminated triple-quoted string.
UnterminatedTripleQuotedString,
/// A lambda expression without parentheses was encountered.
LambdaWithoutParentheses,
}
impl std::fmt::Display for FStringErrorType {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
use FStringErrorType::{
InvalidConversionFlag, LambdaWithoutParentheses, SingleRbrace, UnclosedLbrace,
UnterminatedString, UnterminatedTripleQuotedString,
};
match self {
UnclosedLbrace => write!(f, "expecting '}}'"),
InvalidConversionFlag => write!(f, "invalid conversion character"),
SingleRbrace => write!(f, "single '}}' is not allowed"),
UnterminatedString => write!(f, "unterminated string"),
UnterminatedTripleQuotedString => write!(f, "unterminated triple-quoted string"),
LambdaWithoutParentheses => {
write!(f, "lambda expressions are not allowed without parentheses")
}
}
}
}
/// Represents the different types of errors that can occur during parsing.
#[derive(Debug, PartialEq)]
pub enum ParseErrorType {
/// An unexpected error occurred.
OtherError(String),
/// An empty slice was found during parsing, e.g. `l[]`.
EmptySlice,
/// An invalid expression was found in the assignment `target`.
InvalidAssignmentTarget,
/// An invalid expression was found in the named assignment `target`.
InvalidNamedAssignmentTarget,
/// An invalid expression was found in the augmented assignment `target`.
InvalidAugmentedAssignmentTarget,
/// An invalid expression was found in the delete `target`.
InvalidDeleteTarget,
/// Multiple simple statements were found on the same line without a `;` separating them.
SimpleStmtsInSameLine,
/// An unexpected indentation was found during parsing.
UnexpectedIndentation,
/// The statement being parsed cannot be `async`.
StmtIsNotAsync(TokenKind),
/// A parameter was found after a var-keyword parameter (`**kwargs`).
ParamFollowsVarKeywordParam,
/// A positional argument follows a keyword argument.
PositionalArgumentError,
/// An iterable argument unpacking `*args` follows keyword argument unpacking `**kwargs`.
UnpackedArgumentError,
/// A non-default argument follows a default argument.
DefaultArgumentError,
/// A simple statement and a compound statement were found on the same line.
SimpleStmtAndCompoundStmtInSameLine,
/// An invalid `match` case pattern was found.
InvalidMatchPatternLiteral { pattern: TokenKind },
/// The parser expected a specific token that was not found.
ExpectedToken {
expected: TokenKind,
found: TokenKind,
},
/// A duplicate argument was found in a function definition.
DuplicateArgumentError(String),
/// A keyword argument was repeated.
DuplicateKeywordArgumentError(String),
/// An f-string error containing the [`FStringErrorType`].
FStringError(FStringErrorType),
/// Parser encountered an error during lexing.
Lexical(LexicalErrorType),
// RustPython specific.
/// Parser encountered an extra token
ExtraToken(Tok),
/// Parser encountered an invalid token
InvalidToken,
/// Parser encountered an unexpected token
UnrecognizedToken(Tok, Option<String>),
/// Parser encountered an unexpected end of input
Eof,
}
impl std::error::Error for ParseErrorType {}
impl std::fmt::Display for ParseErrorType {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match self {
ParseErrorType::OtherError(msg) => write!(f, "{msg}"),
ParseErrorType::ExpectedToken { found, expected } => {
write!(f, "expected {expected:?}, found {found:?}")
}
ParseErrorType::Lexical(ref lex_error) => write!(f, "{lex_error}"),
ParseErrorType::SimpleStmtsInSameLine => {
write!(f, "use `;` to separate simple statements")
}
ParseErrorType::SimpleStmtAndCompoundStmtInSameLine => write!(
f,
"compound statements not allowed in the same line as simple statements"
),
ParseErrorType::StmtIsNotAsync(kind) => {
write!(f, "`{kind:?}` statement cannot be async")
}
ParseErrorType::UnpackedArgumentError => {
write!(
f,
"iterable argument unpacking follows keyword argument unpacking"
)
}
ParseErrorType::PositionalArgumentError => {
write!(f, "positional argument follows keyword argument unpacking")
}
ParseErrorType::EmptySlice => write!(f, "slice cannot be empty"),
ParseErrorType::ParamFollowsVarKeywordParam => {
write!(f, "parameters cannot follow var-keyword parameter")
}
ParseErrorType::DefaultArgumentError => {
write!(f, "non-default argument follows default argument")
}
ParseErrorType::InvalidMatchPatternLiteral { pattern } => {
write!(f, "invalid pattern `{pattern:?}`")
}
ParseErrorType::UnexpectedIndentation => write!(f, "unexpected indentation"),
ParseErrorType::InvalidAssignmentTarget => write!(f, "invalid assignment target"),
ParseErrorType::InvalidNamedAssignmentTarget => {
write!(f, "invalid named assignment target")
}
ParseErrorType::InvalidAugmentedAssignmentTarget => {
write!(f, "invalid augmented assignment target")
}
ParseErrorType::InvalidDeleteTarget => {
write!(f, "invalid delete target")
}
ParseErrorType::DuplicateArgumentError(arg_name) => {
write!(f, "duplicate argument '{arg_name}' in function definition")
}
ParseErrorType::DuplicateKeywordArgumentError(arg_name) => {
write!(f, "keyword argument repeated: {arg_name}")
}
ParseErrorType::FStringError(ref fstring_error) => {
write!(f, "f-string: {fstring_error}")
}
// RustPython specific.
ParseErrorType::Eof => write!(f, "Got unexpected EOF"),
ParseErrorType::ExtraToken(ref tok) => write!(f, "Got extraneous token: {tok:?}"),
ParseErrorType::InvalidToken => write!(f, "Got invalid token"),
ParseErrorType::UnrecognizedToken(ref tok, ref expected) => {
if *tok == Tok::Indent {
write!(f, "Unexpected indent")
} else if expected.as_deref() == Some("Indent") {
write!(f, "Expected an indented block")
} else {
write!(f, "Unexpected token {tok}")
}
}
}
}
}

View File

@@ -1,699 +0,0 @@
/*!
Defines some helper routines for rejecting invalid Python programs.
These routines are named in a way that supports qualified use. For example,
`invalid::assignment_targets`.
*/
use {ruff_python_ast::Expr, ruff_text_size::TextSize};
use crate::lexer::{LexicalError, LexicalErrorType};
/// Returns an error for invalid assignment targets.
///
/// # Errors
///
/// This returns an error when any of the given expressions are themselves
/// or contain an expression that is invalid on the left hand side of an
/// assignment. For example, all literal expressions are invalid assignment
/// targets.
pub(crate) fn assignment_targets(targets: &[Expr]) -> Result<(), LexicalError> {
for t in targets {
assignment_target(t)?;
}
Ok(())
}
/// Returns an error if the given target is invalid for the left hand side of
/// an assignment.
///
/// # Errors
///
/// This returns an error when the given expression is itself or contains an
/// expression that is invalid on the left hand side of an assignment. For
/// example, all literal expressions are invalid assignment targets.
pub(crate) fn assignment_target(target: &Expr) -> Result<(), LexicalError> {
// Allowing a glob import here because of its limited scope.
#[allow(clippy::enum_glob_use)]
use self::Expr::*;
let err = |location: TextSize| -> LexicalError {
let error = LexicalErrorType::AssignmentError;
LexicalError::new(error, location)
};
match *target {
BoolOp(ref e) => Err(err(e.range.start())),
Named(ref e) => Err(err(e.range.start())),
BinOp(ref e) => Err(err(e.range.start())),
UnaryOp(ref e) => Err(err(e.range.start())),
Lambda(ref e) => Err(err(e.range.start())),
If(ref e) => Err(err(e.range.start())),
Dict(ref e) => Err(err(e.range.start())),
Set(ref e) => Err(err(e.range.start())),
ListComp(ref e) => Err(err(e.range.start())),
SetComp(ref e) => Err(err(e.range.start())),
DictComp(ref e) => Err(err(e.range.start())),
Generator(ref e) => Err(err(e.range.start())),
Await(ref e) => Err(err(e.range.start())),
Yield(ref e) => Err(err(e.range.start())),
YieldFrom(ref e) => Err(err(e.range.start())),
Compare(ref e) => Err(err(e.range.start())),
Call(ref e) => Err(err(e.range.start())),
// FString is recursive, but all its forms are invalid as an
// assignment target, so we can reject it without exploring it.
FString(ref e) => Err(err(e.range.start())),
StringLiteral(ref e) => Err(err(e.range.start())),
BytesLiteral(ref e) => Err(err(e.range.start())),
NumberLiteral(ref e) => Err(err(e.range.start())),
BooleanLiteral(ref e) => Err(err(e.range.start())),
NoneLiteral(ref e) => Err(err(e.range.start())),
EllipsisLiteral(ref e) => Err(err(e.range.start())),
// This isn't in the Python grammar but is Jupyter notebook specific.
// It seems like this should be an error. It does also seem like the
// parser prevents this from ever appearing as an assignment target
// anyway. ---AG
IpyEscapeCommand(ref e) => Err(err(e.range.start())),
// The only nested expressions allowed as an assignment target
// are star exprs, lists and tuples.
Starred(ref e) => assignment_target(&e.value),
List(ref e) => assignment_targets(&e.elts),
Tuple(ref e) => assignment_targets(&e.elts),
// Subscript is recursive and can be invalid, but such cases aren't
// syntax errors. For example, `5[1] = 42` is a type error.
Subscript(_) => Ok(()),
// Similar to Subscript, e.g., `5[1:2] = [42]` is a type error.
Slice(_) => Ok(()),
// Similar to Subscript, e.g., `"foo".y = 42` is an attribute error.
Attribute(_) => Ok(()),
// These are always valid as assignment targets.
Name(_) => Ok(()),
}
}
#[cfg(test)]
mod tests {
use crate::parse_suite;
// First we test, broadly, that various kinds of assignments are now
// rejected by the parser. e.g., `5 = 3`, `5 += 3`, `(5): int = 3`.
// Regression test: https://github.com/astral-sh/ruff/issues/6895
#[test]
fn err_literal_assignment() {
let ast = parse_suite(r"5 = 3");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
// This test previously passed before the assignment operator checking
// above, but we include it here for good measure.
#[test]
fn err_assignment_expr() {
let ast = parse_suite(r"(5 := 3)");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: UnrecognizedToken(
ColonEqual,
None,
),
offset: 3,
},
)
"###);
}
#[test]
fn err_literal_augment_assignment() {
let ast = parse_suite(r"5 += 3");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_literal_annotation_assignment() {
let ast = parse_suite(r"(5): int = 3");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 1,
},
)
"###);
}
// Now we exhaustively test all possible cases where assignment can fail.
#[test]
fn err_bool_op() {
let ast = parse_suite(r"x or y = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_named_expr() {
let ast = parse_suite(r"(x := 5) = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 1,
},
)
"###);
}
#[test]
fn err_bin_op() {
let ast = parse_suite(r"x + y = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_unary_op() {
let ast = parse_suite(r"-x = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_lambda() {
let ast = parse_suite(r"(lambda _: 1) = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 1,
},
)
"###);
}
#[test]
fn err_if_exp() {
let ast = parse_suite(r"a if b else c = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_dict() {
let ast = parse_suite(r"{'a':5} = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_set() {
let ast = parse_suite(r"{a} = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_list_comp() {
let ast = parse_suite(r"[x for x in xs] = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_set_comp() {
let ast = parse_suite(r"{x for x in xs} = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_dict_comp() {
let ast = parse_suite(r"{x: x*2 for x in xs} = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_generator_exp() {
let ast = parse_suite(r"(x for x in xs) = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_await() {
let ast = parse_suite(r"await x = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_yield() {
let ast = parse_suite(r"(yield x) = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 1,
},
)
"###);
}
#[test]
fn err_yield_from() {
let ast = parse_suite(r"(yield from xs) = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 1,
},
)
"###);
}
#[test]
fn err_compare() {
let ast = parse_suite(r"a < b < c = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_call() {
let ast = parse_suite(r"foo() = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_formatted_value() {
// N.B. It looks like the parser can't generate a top-level
// FormattedValue, whereas the official Python AST permits
// representing a single f-string containing just a variable as a
// FormattedValue directly.
//
// Bottom line is that because of this, this test is (at present)
// duplicative with the `fstring` test. That is, in theory these tests
// could fail independently, but in practice their failure or success
// is coupled.
//
// See: https://docs.python.org/3/library/ast.html#ast.FormattedValue
let ast = parse_suite(r#"f"{quux}" = 42"#);
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_fstring() {
let ast = parse_suite(r#"f"{foo} and {bar}" = 42"#);
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_string_literal() {
let ast = parse_suite(r#""foo" = 42"#);
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_bytes_literal() {
let ast = parse_suite(r#"b"foo" = 42"#);
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_number_literal() {
let ast = parse_suite(r"123 = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_boolean_literal() {
let ast = parse_suite(r"True = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_none_literal() {
let ast = parse_suite(r"None = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_ellipsis_literal() {
let ast = parse_suite(r"... = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 0,
},
)
"###);
}
#[test]
fn err_starred() {
let ast = parse_suite(r"*foo() = 42");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 1,
},
)
"###);
}
#[test]
fn err_list() {
let ast = parse_suite(r"[x, foo(), y] = [42, 42, 42]");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 4,
},
)
"###);
}
#[test]
fn err_list_nested() {
let ast = parse_suite(r"[[a, b], [[42]], d] = [[1, 2], [[3]], 4]");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 11,
},
)
"###);
}
#[test]
fn err_tuple() {
let ast = parse_suite(r"(x, foo(), y) = (42, 42, 42)");
insta::assert_debug_snapshot!(ast, @r###"
Err(
ParseError {
error: Lexical(
AssignmentError,
),
offset: 4,
},
)
"###);
}
// This last group of tests checks that assignments we expect to be parsed
// (including some interesting ones) continue to be parsed successfully.
#[test]
fn ok_starred() {
let ast = parse_suite(r"*foo = 42");
insta::assert_debug_snapshot!(ast);
}
#[test]
fn ok_list() {
let ast = parse_suite(r"[x, y, z] = [1, 2, 3]");
insta::assert_debug_snapshot!(ast);
}
#[test]
fn ok_tuple() {
let ast = parse_suite(r"(x, y, z) = (1, 2, 3)");
insta::assert_debug_snapshot!(ast);
}
#[test]
fn ok_subscript_normal() {
let ast = parse_suite(r"x[0] = 42");
insta::assert_debug_snapshot!(ast);
}
// This is actually a type error, not a syntax error. So check that it
// doesn't fail parsing.
#[test]
fn ok_subscript_weird() {
let ast = parse_suite(r"5[0] = 42");
insta::assert_debug_snapshot!(ast);
}
#[test]
fn ok_slice_normal() {
let ast = parse_suite(r"x[1:2] = [42]");
insta::assert_debug_snapshot!(ast);
}
// This is actually a type error, not a syntax error. So check that it
// doesn't fail parsing.
#[test]
fn ok_slice_weird() {
let ast = parse_suite(r"5[1:2] = [42]");
insta::assert_debug_snapshot!(ast);
}
#[test]
fn ok_attribute_normal() {
let ast = parse_suite(r"foo.bar = 42");
insta::assert_debug_snapshot!(ast);
}
// This is actually an attribute error, not a syntax error. So check that
// it doesn't fail parsing.
#[test]
fn ok_attribute_weird() {
let ast = parse_suite(r#""foo".y = 42"#);
insta::assert_debug_snapshot!(ast);
}
#[test]
fn ok_name() {
let ast = parse_suite(r"foo = 42");
insta::assert_debug_snapshot!(ast);
}
// This is a sanity test for what looks like an ipython directive being
// assigned to. It doesn't actually parse as an assignment statement,
// but rather as a directive whose value is `foo = 42`.
#[test]
fn ok_ipy_escape_command() {
use crate::Mode;
let src = r"!foo = 42";
let tokens = crate::lexer::lex(src, Mode::Ipython);
let ast = crate::parse_tokens(tokens.collect(), src, Mode::Ipython);
insta::assert_debug_snapshot!(ast);
}
#[test]
fn ok_assignment_expr() {
let ast = parse_suite(r"(x := 5)");
insta::assert_debug_snapshot!(ast);
}
}

View File

@@ -1,18 +1,18 @@
use ruff_python_ast::{self as ast, Expr, ExprContext};
pub(crate) fn set_context(expr: Expr, ctx: ExprContext) -> Expr {
pub(super) fn set_context(expr: Expr, ctx: ExprContext) -> Expr {
match expr {
Expr::Name(ast::ExprName { id, range, .. }) => ast::ExprName { range, id, ctx }.into(),
Expr::Tuple(ast::ExprTuple {
elts,
range,
parenthesized: is_parenthesized,
parenthesized,
ctx: _,
}) => ast::ExprTuple {
elts: elts.into_iter().map(|elt| set_context(elt, ctx)).collect(),
range,
ctx,
parenthesized: is_parenthesized,
parenthesized,
}
.into(),
@@ -55,7 +55,7 @@ pub(crate) fn set_context(expr: Expr, ctx: ExprContext) -> Expr {
#[cfg(test)]
mod tests {
use crate::parser::parse_suite;
use crate::parse_suite;
#[test]
fn test_assign_name() {

View File

@@ -0,0 +1,138 @@
use std::hash::BuildHasherDefault;
// Contains functions that perform validation and parsing of arguments and parameters.
// Checks apply both to functions and to lambdas.
use crate::lexer::{LexicalError, LexicalErrorType};
use ruff_python_ast::{self as ast};
use ruff_text_size::{Ranged, TextRange, TextSize};
use rustc_hash::FxHashSet;
pub(crate) struct ArgumentList {
pub(crate) args: Vec<ast::Expr>,
pub(crate) keywords: Vec<ast::Keyword>,
}
// Perform validation of function/lambda arguments in a function definition.
pub(super) fn validate_arguments(arguments: &ast::Parameters) -> Result<(), LexicalError> {
let mut all_arg_names = FxHashSet::with_capacity_and_hasher(
arguments.posonlyargs.len()
+ arguments.args.len()
+ usize::from(arguments.vararg.is_some())
+ arguments.kwonlyargs.len()
+ usize::from(arguments.kwarg.is_some()),
BuildHasherDefault::default(),
);
let posonlyargs = arguments.posonlyargs.iter();
let args = arguments.args.iter();
let kwonlyargs = arguments.kwonlyargs.iter();
let vararg: Option<&ast::Parameter> = arguments.vararg.as_deref();
let kwarg: Option<&ast::Parameter> = arguments.kwarg.as_deref();
for arg in posonlyargs
.chain(args)
.chain(kwonlyargs)
.map(|arg| &arg.parameter)
.chain(vararg)
.chain(kwarg)
{
let range = arg.range;
let arg_name = arg.name.as_str();
if !all_arg_names.insert(arg_name) {
return Err(LexicalError::new(
LexicalErrorType::DuplicateArgumentError(arg_name.to_string().into_boxed_str()),
range,
));
}
}
Ok(())
}
pub(super) fn validate_pos_params(
args: &(
Vec<ast::ParameterWithDefault>,
Vec<ast::ParameterWithDefault>,
),
) -> Result<(), LexicalError> {
let (posonlyargs, args) = args;
#[allow(clippy::skip_while_next)]
let first_invalid = posonlyargs
.iter()
.chain(args.iter()) // for all args
.skip_while(|arg| arg.default.is_none()) // starting with args without default
.skip_while(|arg| arg.default.is_some()) // and then args with default
.next(); // there must not be any more args without default
if let Some(invalid) = first_invalid {
return Err(LexicalError::new(
LexicalErrorType::DefaultArgumentError,
invalid.parameter.range(),
));
}
Ok(())
}
type FunctionArgument = (
Option<(TextSize, TextSize, Option<ast::Identifier>)>,
ast::Expr,
);
// Parse arguments as supplied during a function/lambda *call*.
pub(super) fn parse_arguments(
function_arguments: Vec<FunctionArgument>,
) -> Result<ArgumentList, LexicalError> {
let mut args = vec![];
let mut keywords = vec![];
let mut keyword_names = FxHashSet::with_capacity_and_hasher(
function_arguments.len(),
BuildHasherDefault::default(),
);
let mut double_starred = false;
for (name, value) in function_arguments {
if let Some((start, end, name)) = name {
// Check for duplicate keyword arguments in the call.
if let Some(keyword_name) = &name {
if !keyword_names.insert(keyword_name.to_string()) {
return Err(LexicalError::new(
LexicalErrorType::DuplicateKeywordArgumentError(
keyword_name.to_string().into_boxed_str(),
),
TextRange::new(start, end),
));
}
} else {
double_starred = true;
}
keywords.push(ast::Keyword {
arg: name,
value,
range: TextRange::new(start, end),
});
} else {
// Positional arguments mustn't follow keyword arguments.
if !keywords.is_empty() && !is_starred(&value) {
return Err(LexicalError::new(
LexicalErrorType::PositionalArgumentError,
value.range(),
));
// Allow starred arguments after keyword arguments but
// not after double-starred arguments.
} else if double_starred {
return Err(LexicalError::new(
LexicalErrorType::UnpackedArgumentError,
value.range(),
));
}
args.push(value);
}
}
Ok(ArgumentList { args, keywords })
}
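A sketch of the call-site rules encoded above, exercised through `parse_suite`:

```rust
use ruff_python_parser::parse_suite;

// Positional arguments may not follow keyword arguments...
assert!(parse_suite("f(x=1, 2)").is_err());
// ...but a starred argument may, as long as no `**` argument came first.
assert!(parse_suite("f(x=1, *args)").is_ok());
assert!(parse_suite("f(**kwargs, *args)").is_err());
```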
// Check if an expression is a starred expression.
const fn is_starred(exp: &ast::Expr) -> bool {
exp.is_starred_expr()
}

View File

@@ -0,0 +1,93 @@
/*!
Defines some helper routines for rejecting invalid Python programs.
These routines are named in a way that supports qualified use. For example,
`invalid::assignment_targets`.
*/
use ruff_python_ast::Expr;
use ruff_text_size::TextRange;
use crate::lexer::{LexicalError, LexicalErrorType};
/// Returns an error for invalid assignment targets.
///
/// # Errors
///
/// This returns an error when any of the given expressions are themselves
/// or contain an expression that is invalid on the left hand side of an
/// assignment. For example, all literal expressions are invalid assignment
/// targets.
pub(crate) fn assignment_targets(targets: &[Expr]) -> Result<(), LexicalError> {
for t in targets {
assignment_target(t)?;
}
Ok(())
}
/// Returns an error if the given target is invalid for the left hand side of
/// an assignment.
///
/// # Errors
///
/// This returns an error when the given expression is itself or contains an
/// expression that is invalid on the left hand side of an assignment. For
/// example, all literal expressions are invalid assignment targets.
pub(crate) fn assignment_target(target: &Expr) -> Result<(), LexicalError> {
// Allowing a glob import here because of its limited scope.
#[allow(clippy::enum_glob_use)]
use self::Expr::*;
let err = |location: TextRange| -> LexicalError {
let error = LexicalErrorType::AssignmentError;
LexicalError::new(error, location)
};
match *target {
BoolOp(ref e) => Err(err(e.range)),
Named(ref e) => Err(err(e.range)),
BinOp(ref e) => Err(err(e.range)),
UnaryOp(ref e) => Err(err(e.range)),
Lambda(ref e) => Err(err(e.range)),
If(ref e) => Err(err(e.range)),
Dict(ref e) => Err(err(e.range)),
Set(ref e) => Err(err(e.range)),
ListComp(ref e) => Err(err(e.range)),
SetComp(ref e) => Err(err(e.range)),
DictComp(ref e) => Err(err(e.range)),
Generator(ref e) => Err(err(e.range)),
Await(ref e) => Err(err(e.range)),
Yield(ref e) => Err(err(e.range)),
YieldFrom(ref e) => Err(err(e.range)),
Compare(ref e) => Err(err(e.range)),
Call(ref e) => Err(err(e.range)),
// FString is recursive, but all its forms are invalid as an
// assignment target, so we can reject it without exploring it.
FString(ref e) => Err(err(e.range)),
StringLiteral(ref e) => Err(err(e.range)),
BytesLiteral(ref e) => Err(err(e.range)),
NumberLiteral(ref e) => Err(err(e.range)),
BooleanLiteral(ref e) => Err(err(e.range)),
NoneLiteral(ref e) => Err(err(e.range)),
EllipsisLiteral(ref e) => Err(err(e.range)),
// This isn't in the Python grammar but is Jupyter notebook specific.
// It seems like this should be an error. It does also seem like the
// parser prevents this from ever appearing as an assignment target
// anyway. ---AG
IpyEscapeCommand(ref e) => Err(err(e.range)),
// The only nested expressions allowed as an assignment target
// are star exprs, lists and tuples.
Starred(ref e) => assignment_target(&e.value),
List(ref e) => assignment_targets(&e.elts),
Tuple(ref e) => assignment_targets(&e.elts),
// Subscript is recursive and can contain invalid expressions, but those aren't syntax errors.
// For example, `5[1] = 42` is a type error.
Subscript(_) => Ok(()),
// Similar to Subscript, e.g., `5[1:2] = [42]` is a type error.
Slice(_) => Ok(()),
// Similar to Subscript, e.g., `"foo".y = 42` is an attribute error.
Attribute(_) => Ok(()),
// These are always valid as assignment targets.
Name(_) => Ok(()),
}
}
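A sketch of the split this match encodes, via `parse_suite`: literal targets are syntax errors, while subscript and attribute targets are left for the runtime to reject.

```rust
use ruff_python_parser::parse_suite;

// Literal targets are rejected at parse time...
assert!(parse_suite("5 = x").is_err());
// ...while subscript targets parse; `5[1] = 42` only fails at runtime.
assert!(parse_suite("x[1] = 42").is_ok());
assert!(parse_suite("5[1] = 42").is_ok());
```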

View File

@@ -0,0 +1,311 @@
//! The LALRPOP based parser implementation.
use itertools::Itertools;
use lalrpop_util::ParseError as LalrpopError;
use ruff_python_ast::{
Expr, ExprAttribute, ExprAwait, ExprBinOp, ExprBoolOp, ExprBooleanLiteral, ExprBytesLiteral,
ExprCall, ExprCompare, ExprDict, ExprDictComp, ExprEllipsisLiteral, ExprFString, ExprGenerator,
ExprIf, ExprIpyEscapeCommand, ExprLambda, ExprList, ExprListComp, ExprName, ExprNamed,
ExprNoneLiteral, ExprNumberLiteral, ExprSet, ExprSetComp, ExprSlice, ExprStarred,
ExprStringLiteral, ExprSubscript, ExprTuple, ExprUnaryOp, ExprYield, ExprYieldFrom, Mod,
};
use ruff_text_size::{Ranged, TextRange, TextSize};
use crate::lexer::{LexResult, LexicalError, LexicalErrorType};
use crate::{Mode, ParseError, ParseErrorType, Tok};
mod context;
mod function;
mod invalid;
#[rustfmt::skip]
#[allow(unreachable_pub)]
#[allow(clippy::type_complexity)]
#[allow(clippy::extra_unused_lifetimes)]
#[allow(clippy::needless_lifetimes)]
#[allow(clippy::unused_self)]
#[allow(clippy::cast_sign_loss)]
#[allow(clippy::default_trait_access)]
#[allow(clippy::let_unit_value)]
#[allow(clippy::just_underscores_and_digits)]
#[allow(clippy::no_effect_underscore_binding)]
#[allow(clippy::trivially_copy_pass_by_ref)]
#[allow(clippy::option_option)]
#[allow(clippy::unnecessary_wraps)]
#[allow(clippy::uninlined_format_args)]
#[allow(clippy::cloned_instead_of_copied)]
mod python {
#[cfg(feature = "lalrpop")]
include!(concat!(env!("OUT_DIR"), "/src/lalrpop/python.rs"));
#[cfg(not(feature = "lalrpop"))]
include!("python.rs");
}
pub(crate) fn parse_tokens(
tokens: Vec<LexResult>,
source: &str,
mode: Mode,
) -> Result<Mod, ParseError> {
let marker_token = (Tok::start_marker(mode), TextRange::default());
let lexer = std::iter::once(Ok(marker_token)).chain(
tokens
.into_iter()
.filter_ok(|token| !matches!(token, (Tok::Comment(..) | Tok::NonLogicalNewline, _))),
);
python::TopParser::new()
.parse(
source,
mode,
lexer.map_ok(|(t, range)| (range.start(), t, range.end())),
)
.map_err(parse_error_from_lalrpop)
}
fn parse_error_from_lalrpop(err: LalrpopError<TextSize, Tok, LexicalError>) -> ParseError {
match err {
// TODO: Are there cases where this isn't an EOF?
LalrpopError::InvalidToken { location } => ParseError {
error: ParseErrorType::Eof,
location: TextRange::empty(location),
},
LalrpopError::ExtraToken { token } => ParseError {
error: ParseErrorType::ExtraToken(token.1),
location: TextRange::new(token.0, token.2),
},
LalrpopError::User { error } => ParseError {
location: error.location(),
error: ParseErrorType::Lexical(error.into_error()),
},
LalrpopError::UnrecognizedToken { token, expected } => {
// Hacky, but it's how CPython does it. See PyParser_AddToken,
// in particular "Only one possible expected token" comment.
let expected = (expected.len() == 1).then(|| expected[0].clone());
ParseError {
error: ParseErrorType::UnrecognizedToken(token.1, expected),
location: TextRange::new(token.0, token.2),
}
}
LalrpopError::UnrecognizedEof { location, expected } => {
// This could be an initial indentation error that we should ignore
let indent_error = expected == ["Indent"];
if indent_error {
ParseError {
error: ParseErrorType::Lexical(LexicalErrorType::IndentationError),
location: TextRange::empty(location),
}
} else {
ParseError {
error: ParseErrorType::Eof,
location: TextRange::empty(location),
}
}
}
}
}
/// An expression that may be parenthesized.
#[derive(Clone, Debug)]
struct ParenthesizedExpr {
/// The range of the expression, including any parentheses.
range: TextRange,
/// The underlying expression.
expr: Expr,
}
impl ParenthesizedExpr {
/// Returns `true` if the expression is parenthesized.
fn is_parenthesized(&self) -> bool {
self.range.start() != self.expr.range().start()
}
}
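A crate-internal sketch of the range comparison (the struct is private, so this only runs inside the module); for the source `(None)`, the outer range includes the parentheses while the inner expression covers only `None`:

```rust
use ruff_python_ast::{self as ast, Expr};
use ruff_text_size::{TextRange, TextSize};

let expr: Expr = ast::ExprNoneLiteral {
    range: TextRange::new(TextSize::new(1), TextSize::new(5)),
}
.into();
let parenthesized = ParenthesizedExpr {
    // Includes the `(` and `)`.
    range: TextRange::new(TextSize::new(0), TextSize::new(6)),
    expr,
};
assert!(parenthesized.is_parenthesized());
```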
impl Ranged for ParenthesizedExpr {
fn range(&self) -> TextRange {
self.range
}
}
impl From<Expr> for ParenthesizedExpr {
fn from(expr: Expr) -> Self {
ParenthesizedExpr {
range: expr.range(),
expr,
}
}
}
impl From<ParenthesizedExpr> for Expr {
fn from(parenthesized_expr: ParenthesizedExpr) -> Self {
parenthesized_expr.expr
}
}
impl From<ExprIpyEscapeCommand> for ParenthesizedExpr {
fn from(payload: ExprIpyEscapeCommand) -> Self {
Expr::IpyEscapeCommand(payload).into()
}
}
impl From<ExprBoolOp> for ParenthesizedExpr {
fn from(payload: ExprBoolOp) -> Self {
Expr::BoolOp(payload).into()
}
}
impl From<ExprNamed> for ParenthesizedExpr {
fn from(payload: ExprNamed) -> Self {
Expr::Named(payload).into()
}
}
impl From<ExprBinOp> for ParenthesizedExpr {
fn from(payload: ExprBinOp) -> Self {
Expr::BinOp(payload).into()
}
}
impl From<ExprUnaryOp> for ParenthesizedExpr {
fn from(payload: ExprUnaryOp) -> Self {
Expr::UnaryOp(payload).into()
}
}
impl From<ExprLambda> for ParenthesizedExpr {
fn from(payload: ExprLambda) -> Self {
Expr::Lambda(payload).into()
}
}
impl From<ExprIf> for ParenthesizedExpr {
fn from(payload: ExprIf) -> Self {
Expr::If(payload).into()
}
}
impl From<ExprDict> for ParenthesizedExpr {
fn from(payload: ExprDict) -> Self {
Expr::Dict(payload).into()
}
}
impl From<ExprSet> for ParenthesizedExpr {
fn from(payload: ExprSet) -> Self {
Expr::Set(payload).into()
}
}
impl From<ExprListComp> for ParenthesizedExpr {
fn from(payload: ExprListComp) -> Self {
Expr::ListComp(payload).into()
}
}
impl From<ExprSetComp> for ParenthesizedExpr {
fn from(payload: ExprSetComp) -> Self {
Expr::SetComp(payload).into()
}
}
impl From<ExprDictComp> for ParenthesizedExpr {
fn from(payload: ExprDictComp) -> Self {
Expr::DictComp(payload).into()
}
}
impl From<ExprGenerator> for ParenthesizedExpr {
fn from(payload: ExprGenerator) -> Self {
Expr::Generator(payload).into()
}
}
impl From<ExprAwait> for ParenthesizedExpr {
fn from(payload: ExprAwait) -> Self {
Expr::Await(payload).into()
}
}
impl From<ExprYield> for ParenthesizedExpr {
fn from(payload: ExprYield) -> Self {
Expr::Yield(payload).into()
}
}
impl From<ExprYieldFrom> for ParenthesizedExpr {
fn from(payload: ExprYieldFrom) -> Self {
Expr::YieldFrom(payload).into()
}
}
impl From<ExprCompare> for ParenthesizedExpr {
fn from(payload: ExprCompare) -> Self {
Expr::Compare(payload).into()
}
}
impl From<ExprCall> for ParenthesizedExpr {
fn from(payload: ExprCall) -> Self {
Expr::Call(payload).into()
}
}
impl From<ExprFString> for ParenthesizedExpr {
fn from(payload: ExprFString) -> Self {
Expr::FString(payload).into()
}
}
impl From<ExprStringLiteral> for ParenthesizedExpr {
fn from(payload: ExprStringLiteral) -> Self {
Expr::StringLiteral(payload).into()
}
}
impl From<ExprBytesLiteral> for ParenthesizedExpr {
fn from(payload: ExprBytesLiteral) -> Self {
Expr::BytesLiteral(payload).into()
}
}
impl From<ExprNumberLiteral> for ParenthesizedExpr {
fn from(payload: ExprNumberLiteral) -> Self {
Expr::NumberLiteral(payload).into()
}
}
impl From<ExprBooleanLiteral> for ParenthesizedExpr {
fn from(payload: ExprBooleanLiteral) -> Self {
Expr::BooleanLiteral(payload).into()
}
}
impl From<ExprNoneLiteral> for ParenthesizedExpr {
fn from(payload: ExprNoneLiteral) -> Self {
Expr::NoneLiteral(payload).into()
}
}
impl From<ExprEllipsisLiteral> for ParenthesizedExpr {
fn from(payload: ExprEllipsisLiteral) -> Self {
Expr::EllipsisLiteral(payload).into()
}
}
impl From<ExprAttribute> for ParenthesizedExpr {
fn from(payload: ExprAttribute) -> Self {
Expr::Attribute(payload).into()
}
}
impl From<ExprSubscript> for ParenthesizedExpr {
fn from(payload: ExprSubscript) -> Self {
Expr::Subscript(payload).into()
}
}
impl From<ExprStarred> for ParenthesizedExpr {
fn from(payload: ExprStarred) -> Self {
Expr::Starred(payload).into()
}
}
impl From<ExprName> for ParenthesizedExpr {
fn from(payload: ExprName) -> Self {
Expr::Name(payload).into()
}
}
impl From<ExprList> for ParenthesizedExpr {
fn from(payload: ExprList) -> Self {
Expr::List(payload).into()
}
}
impl From<ExprTuple> for ParenthesizedExpr {
fn from(payload: ExprTuple) -> Self {
Expr::Tuple(payload).into()
}
}
impl From<ExprSlice> for ParenthesizedExpr {
fn from(payload: ExprSlice) -> Self {
Expr::Slice(payload).into()
}
}
#[cfg(target_pointer_width = "64")]
mod size_assertions {
use static_assertions::assert_eq_size;
use super::ParenthesizedExpr;
assert_eq_size!(ParenthesizedExpr, [u8; 72]);
}

View File

@@ -5,16 +5,19 @@
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};
use ruff_python_ast::{self as ast, Int, IpyEscapeKind};
use super::{
function::{ArgumentList, parse_arguments, validate_pos_params, validate_arguments},
context::set_context,
ParenthesizedExpr,
invalid,
};
use crate::{
FStringErrorType,
Mode,
lexer::{LexicalError, LexicalErrorType},
function::{ArgumentList, parse_arguments, validate_pos_params, validate_arguments},
context::set_context,
string::{StringType, concatenated_strings, parse_fstring_literal_element, parse_string_literal},
string_token_flags::StringKind,
token,
invalid,
};
use lalrpop_util::ParseError;
@@ -157,33 +160,33 @@ ExpressionStatement: ast::Stmt = {
},
};
AssignSuffix: crate::parser::ParenthesizedExpr = {
AssignSuffix: ParenthesizedExpr = {
"=" <e:TestListOrYieldExpr> => e,
"=" <e:IpyEscapeCommandExpr> => e
};
TestListOrYieldExpr: crate::parser::ParenthesizedExpr = {
TestListOrYieldExpr: ParenthesizedExpr = {
TestList,
YieldExpr
}
#[inline]
TestOrStarExprList: crate::parser::ParenthesizedExpr = {
TestOrStarExprList: ParenthesizedExpr = {
// as far as I can tell, these were the same
TestList
};
TestOrStarExpr: crate::parser::ParenthesizedExpr = {
TestOrStarExpr: ParenthesizedExpr = {
Test<"all">,
StarExpr,
};
NamedOrStarExpr: crate::parser::ParenthesizedExpr = {
NamedOrStarExpr: ParenthesizedExpr = {
NamedExpression,
StarExpr,
};
TestOrStarNamedExpr: crate::parser::ParenthesizedExpr = {
TestOrStarNamedExpr: ParenthesizedExpr = {
NamedExpressionTest,
StarExpr,
};
@@ -340,20 +343,20 @@ IpyEscapeCommandStatement: ast::Stmt = {
} else {
Err(LexicalError::new(
LexicalErrorType::OtherError("IPython escape commands are only allowed in `Mode::Ipython`".to_string().into_boxed_str()),
location,
(location..end_location).into(),
))?
}
}
}
IpyEscapeCommandExpr: crate::parser::ParenthesizedExpr = {
IpyEscapeCommandExpr: ParenthesizedExpr = {
<location:@L> <c:ipy_escape_command> <end_location:@R> =>? {
if mode == Mode::Ipython {
// This should never occur as the lexer won't allow it.
if !matches!(c.0, IpyEscapeKind::Magic | IpyEscapeKind::Shell) {
return Err(LexicalError::new(
LexicalErrorType::OtherError("IPython escape command expr is only allowed for % and !".to_string().into_boxed_str()),
location,
(location..end_location).into(),
))?;
}
Ok(ast::ExprIpyEscapeCommand {
@@ -364,7 +367,7 @@ IpyEscapeCommandExpr: crate::parser::ParenthesizedExpr = {
} else {
Err(LexicalError::new(
LexicalErrorType::OtherError("IPython escape commands are only allowed in `Mode::Ipython`".to_string().into_boxed_str()),
location,
(location..end_location).into(),
))?
}
}
@@ -384,7 +387,7 @@ IpyHelpEndEscapeCommandStatement: ast::Stmt = {
let ast::Expr::NumberLiteral(ast::ExprNumberLiteral { value: ast::Number::Int(integer), .. }) = slice.as_ref() else {
return Err(LexicalError::new(
LexicalErrorType::OtherError("only integer literals are allowed in Subscript expressions in help end escape command".to_string().into_boxed_str()),
range.start(),
*range,
));
};
unparse_expr(value, buffer)?;
@@ -400,7 +403,7 @@ IpyHelpEndEscapeCommandStatement: ast::Stmt = {
_ => {
return Err(LexicalError::new(
LexicalErrorType::OtherError("only Name, Subscript and Attribute expressions are allowed in help end escape command".to_string().into_boxed_str()),
expr.start(),
expr.range(),
));
}
}
@@ -411,7 +414,7 @@ IpyHelpEndEscapeCommandStatement: ast::Stmt = {
return Err(ParseError::User {
error: LexicalError::new(
LexicalErrorType::OtherError("IPython escape commands are only allowed in `Mode::Ipython`".to_string().into_boxed_str()),
location,
(location..end_location).into(),
),
});
}
@@ -423,7 +426,7 @@ IpyHelpEndEscapeCommandStatement: ast::Stmt = {
return Err(ParseError::User {
error: LexicalError::new(
LexicalErrorType::OtherError("maximum of 2 `?` tokens are allowed in help end escape command".to_string().into_boxed_str()),
location,
(location..end_location).into(),
),
});
}
@@ -566,7 +569,7 @@ AsPattern: ast::Pattern = {
if name.as_str() == "_" {
Err(LexicalError::new(
LexicalErrorType::OtherError("cannot use '_' as a target".to_string().into_boxed_str()),
location,
(location..end_location).into(),
))?
} else {
Ok(ast::Pattern::MatchAs(
@@ -633,13 +636,13 @@ StarPattern: ast::Pattern = {
}.into(),
}
NumberAtom: crate::parser::ParenthesizedExpr = {
NumberAtom: ParenthesizedExpr = {
<location:@L> <value:Number> <end_location:@R> => ast::Expr::NumberLiteral(
ast::ExprNumberLiteral { value, range: (location..end_location).into() }
).into(),
}
NumberExpr: crate::parser::ParenthesizedExpr = {
NumberExpr: ParenthesizedExpr = {
NumberAtom,
<location:@L> "-" <operand:NumberAtom> <end_location:@R> => ast::Expr::UnaryOp(
ast::ExprUnaryOp {
@@ -650,7 +653,7 @@ NumberExpr: crate::parser::ParenthesizedExpr = {
).into(),
}
AddOpExpr: crate::parser::ParenthesizedExpr = {
AddOpExpr: ParenthesizedExpr = {
<location:@L> <left:NumberExpr> <op:AddOp> <right:NumberAtom> <end_location:@R> => ast::ExprBinOp {
left: Box::new(left.into()),
op,
@@ -1234,7 +1237,7 @@ ParameterListStarArgs<ParameterType, StarParameterType, DoubleStarParameterType>
if kwonlyargs.is_empty() {
return Err(LexicalError::new(
LexicalErrorType::OtherError("named arguments must follow bare *".to_string().into_boxed_str()),
location,
(location..location).into(),
))?;
}
@@ -1299,7 +1302,7 @@ Decorator: ast::Decorator = {
},
};
YieldExpr: crate::parser::ParenthesizedExpr = {
YieldExpr: ParenthesizedExpr = {
<location:@L> "yield" <value:TestList?> <end_location:@R> => ast::ExprYield {
value: value.map(ast::Expr::from).map(Box::new),
range: (location..end_location).into(),
@@ -1310,7 +1313,7 @@ YieldExpr: crate::parser::ParenthesizedExpr = {
}.into(),
};
Test<Goal>: crate::parser::ParenthesizedExpr = {
Test<Goal>: ParenthesizedExpr = {
<location:@L> <body:OrTest<"all">> "if" <test:OrTest<"all">> "else" <orelse:Test<"all">> <end_location:@R> => ast::ExprIf {
test: Box::new(test.into()),
body: Box::new(body.into()),
@@ -1321,12 +1324,12 @@ Test<Goal>: crate::parser::ParenthesizedExpr = {
LambdaDef,
};
NamedExpressionTest: crate::parser::ParenthesizedExpr = {
NamedExpressionTest: ParenthesizedExpr = {
NamedExpression,
Test<"all">,
}
NamedExpressionName: crate::parser::ParenthesizedExpr = {
NamedExpressionName: ParenthesizedExpr = {
<location:@L> <id:Identifier> <end_location:@R> => ast::ExprName {
id: id.into(),
ctx: ast::ExprContext::Store,
@@ -1334,7 +1337,7 @@ NamedExpressionName: crate::parser::ParenthesizedExpr = {
}.into(),
}
NamedExpression: crate::parser::ParenthesizedExpr = {
NamedExpression: ParenthesizedExpr = {
<location:@L> <target:NamedExpressionName> ":=" <value:Test<"all">> <end_location:@R> => {
ast::ExprNamed {
target: Box::new(target.into()),
@@ -1344,12 +1347,12 @@ NamedExpression: crate::parser::ParenthesizedExpr = {
},
};
LambdaDef: crate::parser::ParenthesizedExpr = {
LambdaDef: ParenthesizedExpr = {
<location:@L> "lambda" <location_args:@L> <parameters:ParameterList<UntypedParameter, StarUntypedParameter, DoubleStarUntypedParameter>?> <end_location_args:@R> ":" <fstring_middle:fstring_middle?> <body:Test<"all">> <end_location:@R> =>? {
if fstring_middle.is_some() {
return Err(LexicalError::new(
LexicalErrorType::FStringError(FStringErrorType::LambdaWithoutParentheses),
location,
(location..end_location).into(),
))?;
}
parameters.as_ref().map(validate_arguments).transpose()?;
@@ -1362,7 +1365,7 @@ LambdaDef: crate::parser::ParenthesizedExpr = {
}
}
OrTest<Goal>: crate::parser::ParenthesizedExpr = {
OrTest<Goal>: ParenthesizedExpr = {
<location:@L> <values:(<AndTest<"all">> "or")+> <last: AndTest<"all">> <end_location:@R> => {
let values = values.into_iter().chain(std::iter::once(last)).map(ast::Expr::from).collect();
ast::ExprBoolOp { op: ast::BoolOp::Or, values, range: (location..end_location).into() }.into()
@@ -1370,7 +1373,7 @@ OrTest<Goal>: crate::parser::ParenthesizedExpr = {
AndTest<Goal>,
};
AndTest<Goal>: crate::parser::ParenthesizedExpr = {
AndTest<Goal>: ParenthesizedExpr = {
<location:@L> <values:(<NotTest<"all">> "and")+> <last:NotTest<"all">> <end_location:@R> => {
let values = values.into_iter().chain(std::iter::once(last)).map(ast::Expr::from).collect();
ast::ExprBoolOp { op: ast::BoolOp::And, values, range: (location..end_location).into() }.into()
@@ -1378,7 +1381,7 @@ AndTest<Goal>: crate::parser::ParenthesizedExpr = {
NotTest<Goal>,
};
NotTest<Goal>: crate::parser::ParenthesizedExpr = {
NotTest<Goal>: ParenthesizedExpr = {
<location:@L> "not" <operand:NotTest<"all">> <end_location:@R> => ast::ExprUnaryOp {
operand: Box::new(operand.into()),
op: ast::UnaryOp::Not,
@@ -1387,7 +1390,7 @@ NotTest<Goal>: crate::parser::ParenthesizedExpr = {
Comparison<Goal>,
};
Comparison<Goal>: crate::parser::ParenthesizedExpr = {
Comparison<Goal>: ParenthesizedExpr = {
<location:@L> <left:Expression<"all">> <comparisons:(CompOp Expression<"all">)+> <end_location:@R> => {
let mut ops = Vec::with_capacity(comparisons.len());
let mut comparators = Vec::with_capacity(comparisons.len());
@@ -1418,7 +1421,7 @@ CompOp: ast::CmpOp = {
"is" "not" => ast::CmpOp::IsNot,
};
Expression<Goal>: crate::parser::ParenthesizedExpr = {
Expression<Goal>: ParenthesizedExpr = {
<location:@L> <left:Expression<"all">> "|" <right:XorExpression<"all">> <end_location:@R> => ast::ExprBinOp {
left: Box::new(left.into()),
op: ast::Operator::BitOr,
@@ -1428,7 +1431,7 @@ Expression<Goal>: crate::parser::ParenthesizedExpr = {
XorExpression<Goal>,
};
XorExpression<Goal>: crate::parser::ParenthesizedExpr = {
XorExpression<Goal>: ParenthesizedExpr = {
<location:@L> <left:XorExpression<"all">> "^" <right:AndExpression<"all">> <end_location:@R> => ast::ExprBinOp {
left: Box::new(left.into()),
op: ast::Operator::BitXor,
@@ -1438,7 +1441,7 @@ XorExpression<Goal>: crate::parser::ParenthesizedExpr = {
AndExpression<Goal>,
};
AndExpression<Goal>: crate::parser::ParenthesizedExpr = {
AndExpression<Goal>: ParenthesizedExpr = {
<location:@L> <left:AndExpression<"all">> "&" <right:ShiftExpression<"all">> <end_location:@R> => ast::ExprBinOp {
left: Box::new(left.into()),
op: ast::Operator::BitAnd,
@@ -1448,7 +1451,7 @@ AndExpression<Goal>: crate::parser::ParenthesizedExpr = {
ShiftExpression<Goal>,
};
ShiftExpression<Goal>: crate::parser::ParenthesizedExpr = {
ShiftExpression<Goal>: ParenthesizedExpr = {
<location:@L> <left:ShiftExpression<"all">> <op:ShiftOp> <right:ArithmeticExpression<"all">> <end_location:@R> => ast::ExprBinOp {
left: Box::new(left.into()),
op,
@@ -1463,7 +1466,7 @@ ShiftOp: ast::Operator = {
">>" => ast::Operator::RShift,
};
ArithmeticExpression<Goal>: crate::parser::ParenthesizedExpr = {
ArithmeticExpression<Goal>: ParenthesizedExpr = {
<location:@L> <left:ArithmeticExpression<"all">> <op:AddOp> <right:Term<"all">> <end_location:@R> => ast::ExprBinOp {
left: Box::new(left.into()),
op,
@@ -1478,7 +1481,7 @@ AddOp: ast::Operator = {
"-" => ast::Operator::Sub,
};
Term<Goal>: crate::parser::ParenthesizedExpr = {
Term<Goal>: ParenthesizedExpr = {
<location:@L> <left:Term<"all">> <op:MulOp> <right:Factor<"all">> <end_location:@R> => ast::ExprBinOp {
left: Box::new(left.into()),
op,
@@ -1496,7 +1499,7 @@ MulOp: ast::Operator = {
"@" => ast::Operator::MatMult,
};
Factor<Goal>: crate::parser::ParenthesizedExpr = {
Factor<Goal>: ParenthesizedExpr = {
<location:@L> <op:UnaryOp> <operand:Factor<"all">> <end_location:@R> => ast::ExprUnaryOp {
operand: Box::new(operand.into()),
op,
@@ -1511,7 +1514,7 @@ UnaryOp: ast::UnaryOp = {
"~" => ast::UnaryOp::Invert,
};
Power<Goal>: crate::parser::ParenthesizedExpr = {
Power<Goal>: ParenthesizedExpr = {
<location:@L> <left:AtomExpr<"all">> "**" <right:Factor<"all">> <end_location:@R> => ast::ExprBinOp {
left: Box::new(left.into()),
op: ast::Operator::Pow,
@@ -1521,14 +1524,14 @@ Power<Goal>: crate::parser::ParenthesizedExpr = {
AtomExpr<Goal>,
};
AtomExpr<Goal>: crate::parser::ParenthesizedExpr = {
AtomExpr<Goal>: ParenthesizedExpr = {
<location:@L> "await" <value:AtomExpr2<"all">> <end_location:@R> => {
ast::ExprAwait { value: Box::new(value.into()), range: (location..end_location).into() }.into()
},
AtomExpr2<Goal>,
}
AtomExpr2<Goal>: crate::parser::ParenthesizedExpr = {
AtomExpr2<Goal>: ParenthesizedExpr = {
Atom<Goal>,
<location:@L> <func:AtomExpr2<"all">> <arguments:Arguments> <end_location:@R> => ast::ExprCall {
func: Box::new(func.into()),
@@ -1549,14 +1552,14 @@ AtomExpr2<Goal>: crate::parser::ParenthesizedExpr = {
}.into(),
};
SubscriptList: crate::parser::ParenthesizedExpr = {
SubscriptList: ParenthesizedExpr = {
Subscript,
<location:@L> <s1:Subscript> "," <end_location:@R> => {
ast::ExprTuple {
elts: vec![s1.into()],
ctx: ast::ExprContext::Load,
range: (location..end_location).into(),
parenthesized: false
parenthesized: false,
}.into()
},
<location:@L> <elts:TwoOrMoreSep<Subscript, ",">> ","? <end_location:@R> => {
@@ -1565,12 +1568,12 @@ SubscriptList: crate::parser::ParenthesizedExpr = {
elts,
ctx: ast::ExprContext::Load,
range: (location..end_location).into(),
parenthesized: false
parenthesized: false,
}.into()
}
};
Subscript: crate::parser::ParenthesizedExpr = {
Subscript: ParenthesizedExpr = {
TestOrStarNamedExpr,
<location:@L> <lower:Test<"all">?> ":" <upper:Test<"all">?> <step:SliceOp?> <end_location:@R> => {
let lower = lower.map(ast::Expr::from).map(Box::new);
@@ -1582,7 +1585,7 @@ Subscript: crate::parser::ParenthesizedExpr = {
}
};
SliceOp: Option<crate::parser::ParenthesizedExpr> = {
SliceOp: Option<ParenthesizedExpr> = {
<location:@L> ":" <e:Test<"all">?> => e,
}
@@ -1619,7 +1622,7 @@ FStringMiddlePattern: ast::FStringElement = {
FStringReplacementField,
<location:@L> <fstring_middle:fstring_middle> <end_location:@R> =>? {
let (source, kind) = fstring_middle;
Ok(parse_fstring_literal_element(source, kind, (location..end_location).into())?)
Ok(ast::FStringElement::Literal(parse_fstring_literal_element(source, kind, (location..end_location).into())?))
}
};
@@ -1628,7 +1631,7 @@ FStringReplacementField: ast::FStringElement = {
if value.expr.is_lambda_expr() && !value.is_parenthesized() {
return Err(LexicalError::new(
LexicalErrorType::FStringError(FStringErrorType::LambdaWithoutParentheses),
value.start(),
value.range(),
))?;
}
let debug_text = debug.map(|_| {
@@ -1679,14 +1682,14 @@ FStringConversion: (TextSize, ast::ConversionFlag) = {
"a" => ast::ConversionFlag::Ascii,
_ => Err(LexicalError::new(
LexicalErrorType::FStringError(FStringErrorType::InvalidConversionFlag),
name_location,
(location..name_location).into(),
))?
};
Ok((location, conversion))
}
};
Atom<Goal>: crate::parser::ParenthesizedExpr = {
Atom<Goal>: ParenthesizedExpr = {
<expr:String> => expr.into(),
<location:@L> <value:Number> <end_location:@R> => ast::ExprNumberLiteral {
value,
@@ -1706,7 +1709,7 @@ Atom<Goal>: crate::parser::ParenthesizedExpr = {
},
<location:@L> "(" <elts:OneOrMore<Test<"all">>> <trailing_comma:","?> ")" <end_location:@R> if Goal != "no-withitems" => {
if elts.len() == 1 && trailing_comma.is_none() {
crate::parser::ParenthesizedExpr {
ParenthesizedExpr {
expr: elts.into_iter().next().unwrap().into(),
range: (location..end_location).into(),
}
@@ -1725,10 +1728,10 @@ Atom<Goal>: crate::parser::ParenthesizedExpr = {
if mid.expr.is_starred_expr() {
return Err(LexicalError::new(
LexicalErrorType::OtherError("cannot use starred expression here".to_string().into_boxed_str()),
mid.start(),
mid.range(),
))?;
}
Ok(crate::parser::ParenthesizedExpr {
Ok(ParenthesizedExpr {
expr: mid.into(),
range: (location..end_location).into(),
})
@@ -1746,9 +1749,9 @@ Atom<Goal>: crate::parser::ParenthesizedExpr = {
elts: Vec::new(),
ctx: ast::ExprContext::Load,
range: (location..end_location).into(),
parenthesized: true
parenthesized: true,
}.into(),
<location:@L> "(" <e:YieldExpr> ")" <end_location:@R> => crate::parser::ParenthesizedExpr {
<location:@L> "(" <e:YieldExpr> ")" <end_location:@R> => ParenthesizedExpr {
expr: e.into(),
range: (location..end_location).into(),
},
@@ -1761,7 +1764,7 @@ Atom<Goal>: crate::parser::ParenthesizedExpr = {
"(" <location:@L> "**" <e:Expression<"all">> ")" <end_location:@R> =>? {
Err(LexicalError::new(
LexicalErrorType::OtherError("cannot use double starred expression here".to_string().into_boxed_str()),
location,
(location..end_location).into(),
).into())
},
<location:@L> "{" <e:DictLiteralValues?> "}" <end_location:@R> => {
@@ -1798,37 +1801,37 @@ Atom<Goal>: crate::parser::ParenthesizedExpr = {
<location:@L> "..." <end_location:@R> => ast::ExprEllipsisLiteral { range: (location..end_location).into() }.into(),
};
ListLiteralValues: Vec<crate::parser::ParenthesizedExpr> = {
ListLiteralValues: Vec<ParenthesizedExpr> = {
<e:OneOrMore<TestOrStarNamedExpr>> ","? => e,
};
DictLiteralValues: Vec<(Option<Box<crate::parser::ParenthesizedExpr>>, crate::parser::ParenthesizedExpr)> = {
DictLiteralValues: Vec<(Option<Box<ParenthesizedExpr>>, ParenthesizedExpr)> = {
<elements:OneOrMore<DictElement>> ","? => elements,
};
DictEntry: (crate::parser::ParenthesizedExpr, crate::parser::ParenthesizedExpr) = {
DictEntry: (ParenthesizedExpr, ParenthesizedExpr) = {
<e1: Test<"all">> ":" <e2: Test<"all">> => (e1, e2),
};
DictElement: (Option<Box<crate::parser::ParenthesizedExpr>>, crate::parser::ParenthesizedExpr) = {
DictElement: (Option<Box<ParenthesizedExpr>>, ParenthesizedExpr) = {
<e:DictEntry> => (Some(Box::new(e.0)), e.1),
"**" <e:Expression<"all">> => (None, e),
};
SetLiteralValues: Vec<crate::parser::ParenthesizedExpr> = {
SetLiteralValues: Vec<ParenthesizedExpr> = {
<e1:OneOrMore<TestOrStarNamedExpr>> ","? => e1
};
ExpressionOrStarExpression: crate::parser::ParenthesizedExpr = {
ExpressionOrStarExpression: ParenthesizedExpr = {
Expression<"all">,
StarExpr
};
ExpressionList: crate::parser::ParenthesizedExpr = {
ExpressionList: ParenthesizedExpr = {
GenericList<ExpressionOrStarExpression>
};
ExpressionList2: Vec<crate::parser::ParenthesizedExpr> = {
ExpressionList2: Vec<ParenthesizedExpr> = {
<elements:OneOrMore<ExpressionOrStarExpression>> ","? => elements,
};
@@ -1837,14 +1840,14 @@ ExpressionList2: Vec<crate::parser::ParenthesizedExpr> = {
// - a single expression
// - a single expression followed by a trailing comma
#[inline]
TestList: crate::parser::ParenthesizedExpr = {
TestList: ParenthesizedExpr = {
GenericList<TestOrStarExpr>
};
GenericList<Element>: crate::parser::ParenthesizedExpr = {
GenericList<Element>: ParenthesizedExpr = {
<location:@L> <elts:OneOrMore<Element>> <trailing_comma:","?> <end_location:@R> => {
if elts.len() == 1 && trailing_comma.is_none() {
crate::parser::ParenthesizedExpr {
ParenthesizedExpr {
expr: elts.into_iter().next().unwrap().into(),
range: (location..end_location).into(),
}
@@ -1861,7 +1864,7 @@ GenericList<Element>: crate::parser::ParenthesizedExpr = {
}
// Test
StarExpr: crate::parser::ParenthesizedExpr = {
StarExpr: ParenthesizedExpr = {
<location:@L> "*" <value:Expression<"all">> <end_location:@R> => ast::ExprStarred {
value: Box::new(value.into()),
ctx: ast::ExprContext::Load,
@@ -1886,8 +1889,8 @@ SingleForComprehension: ast::Comprehension = {
}
};
ExpressionNoCond: crate::parser::ParenthesizedExpr = OrTest<"all">;
ComprehensionIf: crate::parser::ParenthesizedExpr = "if" <c:ExpressionNoCond> => c;
ExpressionNoCond: ParenthesizedExpr = OrTest<"all">;
ComprehensionIf: ParenthesizedExpr = "if" <c:ExpressionNoCond> => c;
Arguments: ast::Arguments = {
<location:@L> "(" <e: Comma<FunctionArgument>> ")" <end_location:@R> =>? {

View File

@@ -1,5 +1,5 @@
---
source: crates/ruff_python_parser/src/context.rs
source: crates/ruff_python_parser/src/lalrpop/context.rs
expression: parse_ast
---
[

View File

@@ -36,12 +36,12 @@ use unicode_ident::{is_xid_continue, is_xid_start};
use ruff_python_ast::{Int, IpyEscapeKind};
use ruff_text_size::{TextLen, TextRange, TextSize};
use crate::error::FStringErrorType;
use crate::lexer::cursor::{Cursor, EOF_CHAR};
use crate::lexer::fstring::{FStringContext, FStrings};
use crate::lexer::indentation::{Indentation, Indentations};
use crate::{
soft_keywords::SoftKeywordTransformer,
string::FStringErrorType,
string_token_flags::{StringKind, StringPrefix},
token::Tok,
Mode,
@@ -287,7 +287,7 @@ impl<'source> Lexer<'source> {
Err(err) => {
return Err(LexicalError::new(
LexicalErrorType::OtherError(format!("{err:?}").into_boxed_str()),
self.token_range().start(),
self.token_range(),
));
}
};
@@ -312,7 +312,7 @@ impl<'source> Lexer<'source> {
if self.cursor.eat_char('_') {
return Err(LexicalError::new(
LexicalErrorType::OtherError("Invalid Syntax".to_string().into_boxed_str()),
self.offset() - TextSize::new(1),
TextRange::new(self.offset() - TextSize::new(1), self.offset()),
));
}
@@ -346,7 +346,7 @@ impl<'source> Lexer<'source> {
LexicalErrorType::OtherError(
"Invalid decimal literal".to_string().into_boxed_str(),
),
self.token_start(),
self.token_range(),
)
})?;
@@ -371,9 +371,11 @@ impl<'source> Lexer<'source> {
// Leading zeros in decimal integer literals are not permitted.
return Err(LexicalError::new(
LexicalErrorType::OtherError(
"Invalid Token".to_string().into_boxed_str(),
"Invalid decimal integer literal"
.to_string()
.into_boxed_str(),
),
self.token_range().start(),
self.token_range(),
));
}
value
@@ -381,7 +383,7 @@ impl<'source> Lexer<'source> {
Err(err) => {
return Err(LexicalError::new(
LexicalErrorType::OtherError(format!("{err:?}").into_boxed_str()),
self.token_range().start(),
self.token_range(),
))
}
};
@@ -598,7 +600,7 @@ impl<'source> Lexer<'source> {
};
return Err(LexicalError::new(
LexicalErrorType::FStringError(error),
self.offset(),
self.token_range(),
));
}
'\n' | '\r' if !fstring.is_triple_quoted() => {
@@ -611,7 +613,7 @@ impl<'source> Lexer<'source> {
}
return Err(LexicalError::new(
LexicalErrorType::FStringError(FStringErrorType::UnterminatedString),
self.offset(),
self.token_range(),
));
}
'\\' => {
@@ -721,22 +723,9 @@ impl<'source> Lexer<'source> {
let Some(index) = memchr::memchr(quote_byte, self.cursor.rest().as_bytes()) else {
self.cursor.skip_to_end();
if let Some(fstring) = self.fstrings.current() {
// When we are in an f-string, check whether the initial quote
// matches with f-strings quotes and if it is, then this must be a
// missing '}' token so raise the proper error.
if fstring.quote_char() == quote
&& fstring.is_triple_quoted() == kind.is_triple_quoted()
{
return Err(LexicalError::new(
LexicalErrorType::FStringError(FStringErrorType::UnclosedLbrace),
self.cursor.text_len(),
));
}
}
return Err(LexicalError::new(
LexicalErrorType::Eof,
self.cursor.text_len(),
LexicalErrorType::UnclosedStringError,
self.token_range(),
));
};
@@ -770,22 +759,9 @@ impl<'source> Lexer<'source> {
else {
self.cursor.skip_to_end();
if let Some(fstring) = self.fstrings.current() {
// When we are in an f-string, check whether the initial quote
// matches with f-strings quotes and if it is, then this must be a
// missing '}' token so raise the proper error.
if fstring.quote_char() == quote
&& fstring.is_triple_quoted() == kind.is_triple_quoted()
{
return Err(LexicalError::new(
LexicalErrorType::FStringError(FStringErrorType::UnclosedLbrace),
self.offset(),
));
}
}
return Err(LexicalError::new(
LexicalErrorType::StringError,
self.offset(),
self.token_range(),
));
};
@@ -811,26 +787,9 @@ impl<'source> Lexer<'source> {
match ch {
Some('\r' | '\n') => {
if let Some(fstring) = self.fstrings.current() {
// When we are in an f-string, check whether the initial quote
// matches with f-strings quotes and if it is, then this must be a
// missing '}' token so raise the proper error.
if fstring.quote_char() == quote && !fstring.is_triple_quoted() {
return Err(LexicalError::new(
LexicalErrorType::FStringError(
FStringErrorType::UnclosedLbrace,
),
self.offset() - TextSize::new(1),
));
}
}
return Err(LexicalError::new(
LexicalErrorType::OtherError(
"EOL while scanning string literal"
.to_string()
.into_boxed_str(),
),
self.offset() - TextSize::new(1),
LexicalErrorType::UnclosedStringError,
self.token_range(),
));
}
Some(ch) if ch == quote => {
@@ -879,7 +838,7 @@ impl<'source> Lexer<'source> {
self.pending_indentation = Some(indentation);
let offset = self.offset();
self.indentations.dedent_one(indentation).map_err(|_| {
LexicalError::new(LexicalErrorType::IndentationError, offset)
LexicalError::new(LexicalErrorType::IndentationError, self.token_range())
})?;
return Ok((Tok::Dedent, TextRange::empty(offset)));
}
@@ -887,7 +846,7 @@ impl<'source> Lexer<'source> {
Err(_) => {
return Err(LexicalError::new(
LexicalErrorType::IndentationError,
self.offset(),
self.token_range(),
));
}
}
@@ -913,7 +872,7 @@ impl<'source> Lexer<'source> {
} else {
Err(LexicalError::new(
LexicalErrorType::UnrecognizedToken { tok: c },
self.token_start(),
self.token_range(),
))
}
} else {
@@ -937,11 +896,11 @@ impl<'source> Lexer<'source> {
if self.cursor.eat_char('\r') {
self.cursor.eat_char('\n');
} else if self.cursor.is_eof() {
return Err(LexicalError::new(LexicalErrorType::Eof, self.token_start()));
return Err(LexicalError::new(LexicalErrorType::Eof, self.token_range()));
} else if !self.cursor.eat_char('\n') {
return Err(LexicalError::new(
LexicalErrorType::LineContinuationError,
self.token_start(),
self.token_range(),
));
}
}
@@ -975,11 +934,11 @@ impl<'source> Lexer<'source> {
if self.cursor.eat_char('\r') {
self.cursor.eat_char('\n');
} else if self.cursor.is_eof() {
return Err(LexicalError::new(LexicalErrorType::Eof, self.token_start()));
return Err(LexicalError::new(LexicalErrorType::Eof, self.token_range()));
} else if !self.cursor.eat_char('\n') {
return Err(LexicalError::new(
LexicalErrorType::LineContinuationError,
self.token_start(),
self.token_range(),
));
}
indentation = Indentation::root();
@@ -1017,7 +976,7 @@ impl<'source> Lexer<'source> {
self.pending_indentation = Some(indentation);
self.indentations.dedent_one(indentation).map_err(|_| {
LexicalError::new(LexicalErrorType::IndentationError, self.offset())
LexicalError::new(LexicalErrorType::IndentationError, self.token_range())
})?;
Some((Tok::Dedent, TextRange::empty(self.offset())))
@@ -1033,7 +992,7 @@ impl<'source> Lexer<'source> {
Err(_) => {
return Err(LexicalError::new(
LexicalErrorType::IndentationError,
self.offset(),
self.token_range(),
));
}
};
@@ -1047,7 +1006,7 @@ impl<'source> Lexer<'source> {
if self.nesting > 0 {
// Reset the nesting to avoid going into infinite loop.
self.nesting = 0;
return Err(LexicalError::new(LexicalErrorType::Eof, self.offset()));
return Err(LexicalError::new(LexicalErrorType::Eof, self.token_range()));
}
// Next, insert a trailing newline, if required.
@@ -1214,7 +1173,7 @@ impl<'source> Lexer<'source> {
if fstring.nesting() == self.nesting {
return Err(LexicalError::new(
LexicalErrorType::FStringError(FStringErrorType::SingleRbrace),
self.token_start(),
self.token_range(),
));
}
fstring.try_end_format_spec(self.nesting);
@@ -1308,7 +1267,7 @@ impl<'source> Lexer<'source> {
return Err(LexicalError::new(
LexicalErrorType::UnrecognizedToken { tok: c },
self.token_start(),
self.token_range(),
));
}
};
@@ -1372,12 +1331,12 @@ pub struct LexicalError {
/// The type of error that occurred.
error: LexicalErrorType,
/// The location of the error.
location: TextSize,
location: TextRange,
}
impl LexicalError {
/// Creates a new `LexicalError` with the given error type and location.
pub fn new(error: LexicalErrorType, location: TextSize) -> Self {
pub fn new(error: LexicalErrorType, location: TextRange) -> Self {
Self { error, location }
}
@@ -1389,7 +1348,7 @@ impl LexicalError {
self.error
}
pub fn location(&self) -> TextSize {
pub fn location(&self) -> TextRange {
self.location
}
}
@@ -1414,7 +1373,7 @@ impl std::fmt::Display for LexicalError {
f,
"{} at byte offset {}",
self.error(),
u32::from(self.location())
u32::from(self.location().start())
)
}
}
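A sketch of what callers now see, assuming an input like `1_` still produces a lexical error as in the hunks above:

```rust
use ruff_python_parser::{lexer::lex, Mode};

// Lexical errors now carry a full `TextRange`; `Display` reports the
// start offset of that range.
let err = lex("1_", Mode::Module)
    .find_map(Result::err)
    .expect("expected a lexical error");
println!("{err}"); // "<message> at byte offset <start>"
```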
@@ -1429,9 +1388,14 @@ pub enum LexicalErrorType {
// to use the `UnicodeError` variant instead.
#[doc(hidden)]
StringError,
// TODO: Should take a start/end position to report.
/// A string literal without the closing quote.
UnclosedStringError,
/// Decoding of a unicode escape sequence in a string literal failed.
UnicodeError,
/// Missing the `{` for unicode escape sequence.
MissingUnicodeLbrace,
/// Missing the `}` for unicode escape sequence.
MissingUnicodeRbrace,
/// The nesting of brackets/braces/parentheses is not balanced.
NestingError,
/// The indentation is not consistent.
@@ -1453,6 +1417,8 @@ pub enum LexicalErrorType {
UnrecognizedToken { tok: char },
/// An f-string error containing the [`FStringErrorType`].
FStringError(FStringErrorType),
/// Invalid character encountered in a byte literal.
InvalidByteLiteral,
/// An unexpected character was encountered after a line continuation.
LineContinuationError,
/// An unexpected end of file was encountered.
@@ -1470,6 +1436,9 @@ impl std::fmt::Display for LexicalErrorType {
match self {
LexicalErrorType::StringError => write!(f, "Got unexpected string"),
LexicalErrorType::FStringError(error) => write!(f, "f-string: {error}"),
LexicalErrorType::InvalidByteLiteral => {
write!(f, "bytes can only contain ASCII literal characters")
}
LexicalErrorType::UnicodeError => write!(f, "Got unexpected unicode"),
LexicalErrorType::NestingError => write!(f, "Got unexpected nesting"),
LexicalErrorType::IndentationError => {
@@ -1508,6 +1477,15 @@ impl std::fmt::Display for LexicalErrorType {
LexicalErrorType::Eof => write!(f, "unexpected EOF while parsing"),
LexicalErrorType::AssignmentError => write!(f, "invalid assignment target"),
LexicalErrorType::OtherError(msg) => write!(f, "{msg}"),
LexicalErrorType::UnclosedStringError => {
write!(f, "missing closing quote in string literal")
}
LexicalErrorType::MissingUnicodeLbrace => {
write!(f, "Missing `{{` in Unicode escape sequence")
}
LexicalErrorType::MissingUnicodeRbrace => {
write!(f, "Missing `}}` in Unicode escape sequence")
}
}
}
}
@@ -2302,9 +2280,7 @@ f"{(lambda x:{x})}"
#[test]
fn test_fstring_error() {
use FStringErrorType::{
SingleRbrace, UnclosedLbrace, UnterminatedString, UnterminatedTripleQuotedString,
};
use FStringErrorType::{SingleRbrace, UnterminatedString, UnterminatedTripleQuotedString};
assert_eq!(lex_fstring_error("f'}'"), SingleRbrace);
assert_eq!(lex_fstring_error("f'{{}'"), SingleRbrace);
@@ -2315,18 +2291,6 @@ f"{(lambda x:{x})}"
assert_eq!(lex_fstring_error("f'{3:}}>10}'"), SingleRbrace);
assert_eq!(lex_fstring_error(r"f'\{foo}\}'"), SingleRbrace);
assert_eq!(lex_fstring_error("f'{'"), UnclosedLbrace);
assert_eq!(lex_fstring_error("f'{foo!r'"), UnclosedLbrace);
assert_eq!(lex_fstring_error("f'{foo='"), UnclosedLbrace);
assert_eq!(
lex_fstring_error(
r#"f"{"
"#
),
UnclosedLbrace
);
assert_eq!(lex_fstring_error(r#"f"""{""""#), UnclosedLbrace);
assert_eq!(lex_fstring_error(r#"f""#), UnterminatedString);
assert_eq!(lex_fstring_error(r"f'"), UnterminatedString);
@@ -2341,25 +2305,4 @@ f"{(lambda x:{x})}"
UnterminatedTripleQuotedString
);
}
#[test]
fn test_fstring_error_location() {
assert_debug_snapshot!(lex_error("f'{'"), @r###"
LexicalError {
error: FStringError(
UnclosedLbrace,
),
location: 4,
}
"###);
assert_debug_snapshot!(lex_error("f'{'α"), @r###"
LexicalError {
error: FStringError(
UnclosedLbrace,
),
location: 6,
}
"###);
}
}

View File

@@ -109,30 +109,227 @@
//! [parsing]: https://en.wikipedia.org/wiki/Parsing
//! [lexer]: crate::lexer
pub use parser::{
parse, parse_expression, parse_expression_starts_at, parse_program, parse_starts_at,
parse_suite, parse_tokens, ParseError, ParseErrorType,
};
use ruff_python_ast::{Mod, PySourceType, Suite};
pub use string::FStringErrorType;
use std::cell::Cell;
pub use error::{FStringErrorType, ParseError, ParseErrorType};
use lexer::{lex, lex_starts_at};
pub use parser::Program;
use ruff_python_ast::{Expr, Mod, ModModule, PySourceType, Suite};
use ruff_text_size::TextSize;
pub use string_token_flags::StringKind;
pub use token::{Tok, TokenKind};
use crate::lexer::LexResult;
mod context;
mod function;
mod invalid;
// Skip flattening lexer to distinguish from full ruff_python_parser
mod error;
mod lalrpop;
pub mod lexer;
mod parser;
mod soft_keywords;
mod string;
mod string_token_flags;
mod token;
mod token_set;
mod token_source;
pub mod typing;
thread_local! {
static NEW_PARSER: Cell<bool> = Cell::new(std::env::var("NEW_PARSER").is_ok());
}
/// Controls whether the current thread uses the new hand-written parser or the old LALRPOP-based parser.
///
/// Uses the new hand-written parser if `use_new_parser` is `true`.
///
/// Defaults to the new hand-written parser if the `NEW_PARSER` environment variable is set.
pub fn set_new_parser(use_new_parser: bool) {
NEW_PARSER.set(use_new_parser);
}
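Usage is a one-liner per thread; subsequent parses dispatch to `crate::parser` instead of `crate::lalrpop`:

```rust
use ruff_python_parser::{parse_suite, set_new_parser};

// Opt the current thread into the hand-written parser.
set_new_parser(true);
assert!(parse_suite("x = 1").is_ok());
```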
/// Parse a full Python program usually consisting of multiple lines.
///
/// This is a convenience function that can be used to parse a full Python program without having to
/// specify the [`Mode`] or the location. It is probably what you want to use most of the time.
///
/// # Example
///
/// For example, parsing a simple function definition and a call to that function:
///
/// ```
/// use ruff_python_parser as parser;
/// let source = r#"
/// def foo():
/// return 42
///
/// print(foo())
/// "#;
/// let program = parser::parse_program(source);
/// assert!(program.is_ok());
/// ```
pub fn parse_program(source: &str) -> Result<ModModule, ParseError> {
let lexer = lex(source, Mode::Module);
match parse_tokens(lexer.collect(), source, Mode::Module)? {
Mod::Module(m) => Ok(m),
Mod::Expression(_) => unreachable!("Mode::Module doesn't return other variant"),
}
}
pub fn parse_suite(source: &str) -> Result<Suite, ParseError> {
parse_program(source).map(|m| m.body)
}
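`parse_suite` is `parse_program` minus the module wrapper, yielding the statement list directly:

```rust
use ruff_python_parser::parse_suite;

let body = parse_suite("x = 1\ny = 2").expect("valid program");
assert_eq!(body.len(), 2);
```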
/// Parses a single Python expression.
///
/// This convenience function can be used to parse a single expression without having to
/// specify the [`Mode`] or the location.
///
/// # Example
///
/// For example, parsing a single expression denoting the addition of two numbers:
///
/// ```
/// use ruff_python_parser as parser;
/// let expr = parser::parse_expression("1 + 2");
///
/// assert!(expr.is_ok());
///
/// ```
pub fn parse_expression(source: &str) -> Result<Expr, ParseError> {
let lexer = lex(source, Mode::Expression).collect();
match parse_tokens(lexer, source, Mode::Expression)? {
Mod::Expression(expression) => Ok(*expression.body),
Mod::Module(_m) => unreachable!("Mode::Expression doesn't return other variant"),
}
}
/// Parses a Python expression from a given location.
///
/// This function allows you to specify the location of the expression in the source code; other
/// than that, it behaves exactly like [`parse_expression`].
///
/// # Example
///
/// Parsing a single expression denoting the addition of two numbers, but this time specifying a different,
/// somewhat silly, location:
///
/// ```
/// use ruff_python_parser::{parse_expression_starts_at};
/// # use ruff_text_size::TextSize;
///
/// let expr = parse_expression_starts_at("1 + 2", TextSize::from(400));
/// assert!(expr.is_ok());
/// ```
pub fn parse_expression_starts_at(source: &str, offset: TextSize) -> Result<Expr, ParseError> {
let lexer = lex_starts_at(source, Mode::Expression, offset).collect();
match parse_tokens(lexer, source, Mode::Expression)? {
Mod::Expression(expression) => Ok(*expression.body),
Mod::Module(_m) => unreachable!("Mode::Expression doesn't return other variant"),
}
}
/// Parse the given Python source code using the specified [`Mode`].
///
/// This is the most general function for parsing Python code. Based on the [`Mode`] supplied,
/// it can be used to parse a single expression, a full Python program, an interactive expression
/// or a Python program containing IPython escape commands.
///
/// # Example
///
/// If we want to parse a simple expression, we can use the [`Mode::Expression`] mode during
/// parsing:
///
/// ```
/// use ruff_python_parser::{Mode, parse};
///
/// let expr = parse("1 + 2", Mode::Expression);
/// assert!(expr.is_ok());
/// ```
///
/// Alternatively, we can parse a full Python program consisting of multiple lines:
///
/// ```
/// use ruff_python_parser::{Mode, parse};
///
/// let source = r#"
/// class Greeter:
///
/// def greet(self):
/// print("Hello, world!")
/// "#;
/// let program = parse(source, Mode::Module);
/// assert!(program.is_ok());
/// ```
///
/// Additionally, we can parse a Python program containing IPython escapes:
///
/// ```
/// use ruff_python_parser::{Mode, parse};
///
/// let source = r#"
/// %timeit 1 + 2
/// ?str.replace
/// !ls
/// "#;
/// let program = parse(source, Mode::Ipython);
/// assert!(program.is_ok());
/// ```
pub fn parse(source: &str, mode: Mode) -> Result<Mod, ParseError> {
let lxr = lexer::lex(source, mode);
parse_tokens(lxr.collect(), source, mode)
}
/// Parse the given Python source code using the specified [`Mode`] and [`TextSize`].
///
/// This function allows you to specify the location of the source code; other than that, it
/// behaves exactly like [`parse`].
///
/// # Example
///
/// ```
/// # use ruff_text_size::TextSize;
/// use ruff_python_parser::{Mode, parse_starts_at};
///
/// let source = r#"
/// def fib(i):
/// a, b = 0, 1
/// for _ in range(i):
/// a, b = b, a + b
/// return a
///
/// print(fib(42))
/// "#;
/// let program = parse_starts_at(source, Mode::Module, TextSize::from(0));
/// assert!(program.is_ok());
/// ```
pub fn parse_starts_at(source: &str, mode: Mode, offset: TextSize) -> Result<Mod, ParseError> {
let lxr = lexer::lex_starts_at(source, mode, offset);
parse_tokens(lxr.collect(), source, mode)
}
/// Parse an iterator of [`LexResult`]s using the specified [`Mode`].
///
/// This allows you to perform preprocessing on the tokens before parsing them.
///
/// # Example
///
/// As an example, instead of parsing a string, we can parse a list of tokens after we generate
/// them using the [`lexer::lex`] function:
///
/// ```
/// use ruff_python_parser::{lexer::lex, Mode, parse_tokens};
///
/// let source = "1 + 2";
/// let expr = parse_tokens(lex(source, Mode::Expression).collect(), source, Mode::Expression);
/// assert!(expr.is_ok());
/// ```
pub fn parse_tokens(tokens: Vec<LexResult>, source: &str, mode: Mode) -> Result<Mod, ParseError> {
if NEW_PARSER.get() {
crate::parser::parse_tokens(tokens, source, mode)
} else {
crate::lalrpop::parse_tokens(tokens, source, mode)
}
}
/// Collect tokens up to and including the first error.
pub fn tokenize(contents: &str, mode: Mode) -> Vec<LexResult> {
let mut tokens: Vec<LexResult> = allocate_tokens_vec(contents);
@@ -250,28 +447,3 @@ impl std::fmt::Display for ModeParseError {
write!(f, r#"mode must be "exec", "eval", "ipython", or "single""#)
}
}
#[rustfmt::skip]
#[allow(unreachable_pub)]
#[allow(clippy::type_complexity)]
#[allow(clippy::extra_unused_lifetimes)]
#[allow(clippy::needless_lifetimes)]
#[allow(clippy::unused_self)]
#[allow(clippy::cast_sign_loss)]
#[allow(clippy::default_trait_access)]
#[allow(clippy::let_unit_value)]
#[allow(clippy::just_underscores_and_digits)]
#[allow(clippy::no_effect_underscore_binding)]
#[allow(clippy::trivially_copy_pass_by_ref)]
#[allow(clippy::option_option)]
#[allow(clippy::unnecessary_wraps)]
#[allow(clippy::uninlined_format_args)]
#[allow(clippy::cloned_instead_of_copied)]
mod python {
#[cfg(feature = "lalrpop")]
include!(concat!(env!("OUT_DIR"), "/src/python.rs"));
#[cfg(not(feature = "lalrpop"))]
include!("python.rs");
}

File diff suppressed because it is too large

View File

@@ -0,0 +1,148 @@
use std::hash::BuildHasherDefault;
use ast::CmpOp;
use ruff_python_ast::{self as ast, Expr, ExprContext};
use rustc_hash::FxHashSet;
use crate::{ParseError, ParseErrorType, TokenKind};
/// Set the `ctx` for `Expr::Id`, `Expr::Attribute`, `Expr::Subscript`, `Expr::Starred`,
/// `Expr::Tuple` and `Expr::List`. If `expr` is either `Expr::Tuple` or `Expr::List`,
/// recursively sets the `ctx` for their elements.
pub(super) fn set_expr_ctx(expr: &mut Expr, new_ctx: ExprContext) {
match expr {
Expr::Name(ast::ExprName { ctx, .. })
| Expr::Attribute(ast::ExprAttribute { ctx, .. })
| Expr::Subscript(ast::ExprSubscript { ctx, .. }) => *ctx = new_ctx,
Expr::Starred(ast::ExprStarred { value, ctx, .. }) => {
*ctx = new_ctx;
set_expr_ctx(value, new_ctx);
}
Expr::UnaryOp(ast::ExprUnaryOp { operand, .. }) => {
set_expr_ctx(operand, new_ctx);
}
Expr::List(ast::ExprList { elts, ctx, .. })
| Expr::Tuple(ast::ExprTuple { elts, ctx, .. }) => {
*ctx = new_ctx;
elts.iter_mut()
.for_each(|element| set_expr_ctx(element, new_ctx));
}
_ => {}
}
}
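A crate-internal sketch (the helper is `pub(super)`): a parsed target starts out in `Load` context and is rewritten in place once it turns out to be an assignment target.

```rust
use ruff_python_ast::{self as ast, Expr, ExprContext};
use ruff_text_size::TextRange;

let mut target = Expr::Name(ast::ExprName {
    id: "x".into(),
    ctx: ExprContext::Load,
    range: TextRange::default(),
});
set_expr_ctx(&mut target, ExprContext::Store);
assert!(matches!(
    &target,
    Expr::Name(ast::ExprName { ctx: ExprContext::Store, .. })
));
```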
/// Check if the given expression is itself or contains an expression that is
/// valid on the left hand side of an assignment. For example, identifiers,
/// starred expressions, attribute expressions, subscript expressions,
/// list and tuple unpacking are valid assignment targets.
pub(super) fn is_valid_assignment_target(expr: &Expr) -> bool {
match expr {
Expr::Starred(ast::ExprStarred { value, .. }) => is_valid_assignment_target(value),
Expr::List(ast::ExprList { elts, .. }) | Expr::Tuple(ast::ExprTuple { elts, .. }) => {
elts.iter().all(is_valid_assignment_target)
}
Expr::Name(_) | Expr::Attribute(_) | Expr::Subscript(_) => true,
_ => false,
}
}
/// Check if the given expression is itself or contains an expression that is
/// valid on the left hand side of an augmented assignment. For example, identifiers,
/// attribute and subscript expressions are valid augmented assignment targets.
pub(super) fn is_valid_aug_assignment_target(expr: &Expr) -> bool {
matches!(
expr,
Expr::Name(_) | Expr::Attribute(_) | Expr::Subscript(_)
)
}
/// Check if the given expression is itself or contains an expression that is
/// valid as a target of a `del` statement.
pub(super) fn is_valid_del_target(expr: &Expr) -> bool {
// https://github.com/python/cpython/blob/d864b0094f9875c5613cbb0b7f7f3ca8f1c6b606/Parser/action_helpers.c#L1150-L1180
match expr {
Expr::List(ast::ExprList { elts, .. }) | Expr::Tuple(ast::ExprTuple { elts, .. }) => {
elts.iter().all(is_valid_del_target)
}
Expr::Name(_) | Expr::Attribute(_) | Expr::Subscript(_) => true,
_ => false,
}
}
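A sketch of the resulting behavior through the public API, assuming the hand-written parser is enabled for the thread:

```rust
use ruff_python_parser::{parse_suite, set_new_parser};

set_new_parser(true);
// `del` accepts names, attributes, subscripts, and list/tuple nestings...
assert!(parse_suite("del x, y[0], z.attr").is_ok());
// ...but not arbitrary expressions.
assert!(parse_suite("del x + y").is_err());
```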
/// Converts a [`TokenKind`] array of size 2 to its correspondent [`CmpOp`].
pub(super) fn token_kind_to_cmp_op(kind: [TokenKind; 2]) -> Result<CmpOp, ()> {
Ok(match kind {
[TokenKind::Is, TokenKind::Not] => CmpOp::IsNot,
[TokenKind::Is, _] => CmpOp::Is,
[TokenKind::In, _] => CmpOp::In,
[TokenKind::EqEqual, _] => CmpOp::Eq,
[TokenKind::Less, _] => CmpOp::Lt,
[TokenKind::Greater, _] => CmpOp::Gt,
[TokenKind::NotEqual, _] => CmpOp::NotEq,
[TokenKind::LessEqual, _] => CmpOp::LtE,
[TokenKind::GreaterEqual, _] => CmpOp::GtE,
[TokenKind::Not, TokenKind::In] => CmpOp::NotIn,
_ => return Err(()),
})
}
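// A usage sketch (illustrative): callers peek two tokens so that the two-token
// operators win over their one-token prefixes, e.g.
//
//     token_kind_to_cmp_op([TokenKind::Is, TokenKind::Not])  // Ok(CmpOp::IsNot)
//     token_kind_to_cmp_op([TokenKind::Is, TokenKind::Name]) // Ok(CmpOp::Is)
//
// mirroring how `a is not b` is a single `is not` comparison in Python.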
/// Validates that the parameter names of a function or lambda definition are unique.
pub(super) fn validate_parameters(parameters: &ast::Parameters) -> Result<(), ParseError> {
let mut all_arg_names = FxHashSet::with_capacity_and_hasher(
parameters.posonlyargs.len()
+ parameters.args.len()
+ usize::from(parameters.vararg.is_some())
+ parameters.kwonlyargs.len()
+ usize::from(parameters.kwarg.is_some()),
BuildHasherDefault::default(),
);
let posonlyargs = parameters.posonlyargs.iter();
let args = parameters.args.iter();
let kwonlyargs = parameters.kwonlyargs.iter();
let vararg: Option<&ast::Parameter> = parameters.vararg.as_deref();
let kwarg: Option<&ast::Parameter> = parameters.kwarg.as_deref();
for arg in posonlyargs
.chain(args)
.chain(kwonlyargs)
.map(|arg| &arg.parameter)
.chain(vararg)
.chain(kwarg)
{
let range = arg.range;
let arg_name = arg.name.as_str();
if !all_arg_names.insert(arg_name) {
return Err(ParseError {
error: ParseErrorType::DuplicateArgumentError(arg_name.to_string()),
location: range,
});
}
}
Ok(())
}
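// For example (illustrative Python), `def f(x, *, x): ...` reaches this
// function with `x` in both `args` and `kwonlyargs`; the second `insert`
// fails and a `DuplicateArgumentError` is reported, matching CPython's
// "duplicate argument 'x' in function definition".
/// Validates that no keyword argument is repeated in a call's argument list,
/// e.g. (illustrative Python) `f(a=1, a=2)` is reported as a
/// `DuplicateKeywordArgumentError`, comparable to CPython's
/// "keyword argument repeated" error.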
pub(super) fn validate_arguments(arguments: &ast::Arguments) -> Result<(), ParseError> {
let mut all_arg_names = FxHashSet::with_capacity_and_hasher(
arguments.keywords.len(),
BuildHasherDefault::default(),
);
for (name, range) in arguments
.keywords
.iter()
.filter_map(|argument| argument.arg.as_ref().map(|arg| (arg, argument.range)))
{
let arg_name = name.as_str();
if !all_arg_names.insert(arg_name) {
return Err(ParseError {
error: ParseErrorType::DuplicateKeywordArgumentError(arg_name.to_string()),
location: range,
});
}
}
Ok(())
}

File diff suppressed because it is too large


@@ -0,0 +1,671 @@
use ruff_python_ast::{self as ast, Expr, ExprContext, Number, Operator, Pattern, Singleton};
use ruff_text_size::{Ranged, TextSize};
use crate::parser::progress::ParserProgress;
use crate::parser::{Parser, SequenceMatchPatternParentheses};
use crate::token_set::TokenSet;
use crate::{ParseErrorType, Tok, TokenKind};
use super::RecoveryContextKind;
/// The set of tokens that can start a literal pattern.
const LITERAL_PATTERN_START_SET: TokenSet = TokenSet::new([
TokenKind::None,
TokenKind::True,
TokenKind::False,
TokenKind::String,
TokenKind::Int,
TokenKind::Float,
TokenKind::Complex,
]);
/// The set of tokens that can start a pattern.
const PATTERN_START_SET: TokenSet = TokenSet::new([
// Star pattern
TokenKind::Star,
// Capture pattern
// Wildcard pattern ('_' is a name token)
// Value pattern (name or attribute)
// Class pattern
TokenKind::Name,
// Group pattern
TokenKind::Lpar,
// Sequence pattern
TokenKind::Lsqb,
// Mapping pattern
TokenKind::Lbrace,
])
.union(LITERAL_PATTERN_START_SET);
/// The set of tokens that can start a mapping pattern.
const MAPPING_PATTERN_START_SET: TokenSet = TokenSet::new([
// Double star pattern
TokenKind::DoubleStar,
// Value pattern
TokenKind::Name,
])
.union(LITERAL_PATTERN_START_SET);
impl<'src> Parser<'src> {
/// Returns `true` if the current token is a valid start of a pattern.
pub(super) fn at_pattern_start(&self) -> bool {
self.at_ts(PATTERN_START_SET)
}
/// Returns `true` if the current token is a valid start of a mapping pattern.
pub(super) fn at_mapping_pattern_start(&self) -> bool {
self.at_ts(MAPPING_PATTERN_START_SET)
}
/// Entry point to start parsing a pattern.
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-patterns>
pub(super) fn parse_match_patterns(&mut self) -> Pattern {
let start = self.node_start();
let pattern = self.parse_match_pattern();
if self.at(TokenKind::Comma) {
Pattern::MatchSequence(self.parse_sequence_match_pattern(pattern, start, None))
} else {
pattern
}
}
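// Illustrative Python (a sketch, not taken from the test suite): a trailing
// comma promotes the first pattern into an open sequence pattern, so
//
//     match subject:
//         case 1, 2:      # one sequence pattern covering `1, 2`
//             ...
//
// parses both literals into a single `Pattern::MatchSequence`.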
/// Parses an `or_pattern` or an `as_pattern`.
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-pattern>
fn parse_match_pattern(&mut self) -> Pattern {
let start = self.node_start();
let mut lhs = self.parse_match_pattern_lhs();
// Or pattern
if self.at(TokenKind::Vbar) {
let mut patterns = vec![lhs];
let mut progress = ParserProgress::default();
while self.eat(TokenKind::Vbar) {
progress.assert_progressing(self);
let pattern = self.parse_match_pattern_lhs();
patterns.push(pattern);
}
lhs = Pattern::MatchOr(ast::PatternMatchOr {
range: self.node_range(start),
patterns,
});
}
// As pattern
if self.eat(TokenKind::As) {
let ident = self.parse_identifier();
lhs = Pattern::MatchAs(ast::PatternMatchAs {
range: self.node_range(start),
name: Some(ident),
pattern: Some(Box::new(lhs)),
});
}
lhs
}
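// Illustrative Python: the two forms handled here compose, e.g.
//
//     match subject:
//         case 1 | 2 as n:    # an or-pattern wrapped in an as-pattern
//             ...
//
// parses as `MatchAs { pattern: MatchOr([1, 2]), name: Some(n) }`.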
/// Parses a closed pattern, optionally extended to a class pattern or a
/// complex literal pattern.
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-closed_pattern>
fn parse_match_pattern_lhs(&mut self) -> Pattern {
let start = self.node_start();
let mut lhs = match self.current_token_kind() {
TokenKind::Lbrace => Pattern::MatchMapping(self.parse_match_pattern_mapping()),
TokenKind::Star => Pattern::MatchStar(self.parse_match_pattern_star()),
TokenKind::Lpar | TokenKind::Lsqb => self.parse_delimited_match_pattern(),
_ => self.parse_match_pattern_literal(),
};
if self.at(TokenKind::Lpar) {
lhs = Pattern::MatchClass(self.parse_match_pattern_class(lhs, start));
}
if self.at(TokenKind::Plus) || self.at(TokenKind::Minus) {
let (operator_token, _) = self.next_token();
let operator = if matches!(operator_token, Tok::Plus) {
Operator::Add
} else {
Operator::Sub
};
let lhs_value = if let Pattern::MatchValue(lhs) = lhs {
if !matches!(&*lhs.value, Expr::NumberLiteral(_) | Expr::UnaryOp(_)) {
self.add_error(
ParseErrorType::OtherError("invalid lhs pattern".to_string()),
&lhs,
);
}
lhs.value
} else {
self.add_error(
ParseErrorType::OtherError("invalid lhs pattern".to_string()),
&lhs,
);
// In case it's not a valid LHS pattern, we'll use an empty `Expr::Name`
// to indicate that.
Box::new(Expr::Name(ast::ExprName {
id: String::new(),
ctx: ExprContext::Invalid,
range: lhs.range(),
}))
};
let rhs_pattern = self.parse_match_pattern_lhs();
let rhs_value = if let Pattern::MatchValue(rhs) = rhs_pattern {
if !matches!(
&*rhs.value,
Expr::NumberLiteral(ast::ExprNumberLiteral {
value: ast::Number::Complex { .. },
..
})
) {
self.add_error(
ParseErrorType::OtherError(
"imaginary number required in complex literal".to_string(),
),
&rhs,
);
}
rhs.value
} else {
self.add_error(
ParseErrorType::OtherError("invalid rhs pattern".to_string()),
rhs_pattern.range(),
);
// In case it's not a valid RHS pattern, we'll use an empty `Expr::Name`
// to indicate that.
Box::new(Expr::Name(ast::ExprName {
id: String::new(),
ctx: ExprContext::Invalid,
range: rhs_pattern.range(),
}))
};
let range = self.node_range(start);
return Pattern::MatchValue(ast::PatternMatchValue {
value: Box::new(Expr::BinOp(ast::ExprBinOp {
left: lhs_value,
op: operator,
right: rhs_value,
range,
})),
range,
});
}
lhs
}
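// Illustrative Python: the `+`/`-` handling above exists solely for complex
// literal patterns, e.g.
//
//     match subject:
//         case 3 + 4j:    # ok: a real part plus an imaginary literal
//             ...
//         case 3 + 4:     # error: imaginary number required in complex literal
//             ...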
/// Parses a mapping pattern.
///
/// # Panics
///
/// If the parser isn't positioned at a `{` token.
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#mapping-patterns>
fn parse_match_pattern_mapping(&mut self) -> ast::PatternMatchMapping {
let start = self.node_start();
self.bump(TokenKind::Lbrace);
let mut keys = vec![];
let mut patterns = vec![];
let mut rest = None;
self.parse_comma_separated_list(RecoveryContextKind::MatchPatternMapping, |parser| {
if parser.eat(TokenKind::DoubleStar) {
rest = Some(parser.parse_identifier());
} else {
let key = match parser.parse_match_pattern_lhs() {
Pattern::MatchValue(ast::PatternMatchValue { value, .. }) => *value,
Pattern::MatchSingleton(ast::PatternMatchSingleton { value, range }) => {
match value {
Singleton::None => Expr::NoneLiteral(ast::ExprNoneLiteral { range }),
Singleton::True => {
Expr::BooleanLiteral(ast::ExprBooleanLiteral { value: true, range })
}
Singleton::False => Expr::BooleanLiteral(ast::ExprBooleanLiteral {
value: false,
range,
}),
}
}
pattern => {
parser.add_error(
ParseErrorType::OtherError("invalid mapping pattern key".to_string()),
&pattern,
);
Expr::Name(ast::ExprName {
id: String::new(),
ctx: ExprContext::Invalid,
range: pattern.range(),
})
}
};
keys.push(key);
parser.expect(TokenKind::Colon);
patterns.push(parser.parse_match_pattern());
}
});
// TODO(dhruvmanila): There can't be any other pattern after a `**` pattern.
// TODO(dhruvmanila): Duplicate literal keys should raise a SyntaxError.
self.expect(TokenKind::Rbrace);
ast::PatternMatchMapping {
range: self.node_range(start),
keys,
patterns,
rest,
}
}
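// Illustrative Python:
//
//     match subject:
//         case {"name": value, **rest}:
//             ...
//
// The `**rest` arm goes through the `DoubleStar` branch above; every other
// key must parse as a literal or value pattern to be a valid mapping key.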
/// Parses a star pattern.
///
/// # Panics
///
/// If the parser isn't positioned at a `*` token.
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-star_pattern>
fn parse_match_pattern_star(&mut self) -> ast::PatternMatchStar {
let start = self.node_start();
self.bump(TokenKind::Star);
let ident = self.parse_identifier();
ast::PatternMatchStar {
range: self.node_range(start),
name: if ident.is_valid() && ident.id == "_" {
None
} else {
Some(ident)
},
}
}
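// Illustrative Python: in `case [first, *rest]:` the star pattern binds the
// tail to `rest`, while `case [first, *_]:` parses with `name: None` because
// `_` is the wildcard.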
/// Entry point to start parsing a sequence pattern.
///
/// # Panics
///
/// If the parser isn't positioned at a `(` or `[` token.
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#sequence-patterns>
fn parse_delimited_match_pattern(&mut self) -> Pattern {
let start = self.node_start();
let parentheses = if self.eat(TokenKind::Lpar) {
SequenceMatchPatternParentheses::Tuple
} else {
self.bump(TokenKind::Lsqb);
SequenceMatchPatternParentheses::List
};
if matches!(
self.current_token_kind(),
TokenKind::Newline | TokenKind::Colon
) {
self.add_error(
ParseErrorType::OtherError(format!(
"missing `{closing}`",
closing = if parentheses.is_list() { "]" } else { ")" }
)),
self.current_token_range(),
);
}
if self.eat(parentheses.closing_kind()) {
return Pattern::MatchSequence(ast::PatternMatchSequence {
patterns: vec![],
range: self.node_range(start),
});
}
let mut pattern = self.parse_match_pattern();
if parentheses.is_list() || self.at(TokenKind::Comma) {
pattern = Pattern::MatchSequence(self.parse_sequence_match_pattern(
pattern,
start,
Some(parentheses),
));
} else {
self.expect(parentheses.closing_kind());
}
pattern
}
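// Illustrative Python: the comma is what separates a sequence pattern from a
// group pattern, e.g.
//
//     match [1]:
//         case (x,):   # single-element sequence pattern: binds x = 1
//             ...
//         case (x):    # group pattern: binds x to the whole subject
//             ...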
/// Parses the rest of a sequence pattern, given the first element.
///
/// If `parentheses` is `None`, this is an [open sequence pattern].
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#sequence-patterns>
///
/// [open sequence pattern]: https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-open_sequence_pattern
fn parse_sequence_match_pattern(
&mut self,
first_element: Pattern,
start: TextSize,
parentheses: Option<SequenceMatchPatternParentheses>,
) -> ast::PatternMatchSequence {
if parentheses.is_some_and(|parentheses| {
self.at(parentheses.closing_kind()) || self.peek_nth(1) == parentheses.closing_kind()
}) {
// The comma is optional if it is a single-element sequence
self.eat(TokenKind::Comma);
} else {
self.expect(TokenKind::Comma);
}
let mut patterns = vec![first_element];
self.parse_comma_separated_list(
RecoveryContextKind::SequenceMatchPattern(parentheses),
|parser| patterns.push(parser.parse_match_pattern()),
);
if let Some(parentheses) = parentheses {
self.expect(parentheses.closing_kind());
}
ast::PatternMatchSequence {
range: self.node_range(start),
patterns,
}
}
/// Parses a literal pattern.
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-literal_pattern>
fn parse_match_pattern_literal(&mut self) -> Pattern {
let start = self.node_start();
match self.current_token_kind() {
TokenKind::None => {
self.bump(TokenKind::None);
Pattern::MatchSingleton(ast::PatternMatchSingleton {
value: Singleton::None,
range: self.node_range(start),
})
}
TokenKind::True => {
self.bump(TokenKind::True);
Pattern::MatchSingleton(ast::PatternMatchSingleton {
value: Singleton::True,
range: self.node_range(start),
})
}
TokenKind::False => {
self.bump(TokenKind::False);
Pattern::MatchSingleton(ast::PatternMatchSingleton {
value: Singleton::False,
range: self.node_range(start),
})
}
TokenKind::String | TokenKind::FStringStart => {
let str = self.parse_strings();
Pattern::MatchValue(ast::PatternMatchValue {
value: Box::new(str),
range: self.node_range(start),
})
}
TokenKind::Complex => {
let (Tok::Complex { real, imag }, _) = self.bump(TokenKind::Complex) else {
unreachable!()
};
let range = self.node_range(start);
Pattern::MatchValue(ast::PatternMatchValue {
value: Box::new(Expr::NumberLiteral(ast::ExprNumberLiteral {
value: Number::Complex { real, imag },
range,
})),
range,
})
}
TokenKind::Int => {
let (Tok::Int { value }, _) = self.bump(TokenKind::Int) else {
unreachable!()
};
let range = self.node_range(start);
Pattern::MatchValue(ast::PatternMatchValue {
value: Box::new(Expr::NumberLiteral(ast::ExprNumberLiteral {
value: Number::Int(value),
range,
})),
range,
})
}
TokenKind::Float => {
let (Tok::Float { value }, _) = self.bump(TokenKind::Float) else {
unreachable!()
};
let range = self.node_range(start);
Pattern::MatchValue(ast::PatternMatchValue {
value: Box::new(Expr::NumberLiteral(ast::ExprNumberLiteral {
value: Number::Float(value),
range,
})),
range,
})
}
TokenKind::Name if self.peek_nth(1) == TokenKind::Dot => {
let (Tok::Name { name }, _) = self.bump(TokenKind::Name) else {
unreachable!()
};
let id = Expr::Name(ast::ExprName {
id: name.to_string(),
ctx: ExprContext::Load,
range: self.node_range(start),
});
let attribute = self.parse_attr_expr_for_match_pattern(id, start);
Pattern::MatchValue(ast::PatternMatchValue {
value: Box::new(attribute),
range: self.node_range(start),
})
}
TokenKind::Name => {
let (Tok::Name { name }, _) = self.bump(TokenKind::Name) else {
unreachable!()
};
let range = self.node_range(start);
Pattern::MatchAs(ast::PatternMatchAs {
range,
pattern: None,
// `_` is the wildcard pattern, which doesn't bind a name.
name: if name == "_" {
None
} else {
Some(ast::Identifier {
id: name.to_string(),
range,
})
},
})
}
TokenKind::Minus
if matches!(
self.peek_nth(1),
TokenKind::Int | TokenKind::Float | TokenKind::Complex
) =>
{
let parsed_expr = self.parse_lhs_expression();
let range = self.node_range(start);
Pattern::MatchValue(ast::PatternMatchValue {
value: Box::new(parsed_expr.expr),
range,
})
}
kind => {
// Upon encountering an unexpected token, return a `Pattern::MatchValue` containing
// an empty `Expr::Name`.
let invalid_node = if kind.is_keyword() {
Expr::Name(self.parse_name())
} else {
self.add_error(
ParseErrorType::OtherError("Expected a pattern".to_string()),
self.current_token_range(),
);
Expr::Name(ast::ExprName {
range: self.missing_node_range(),
id: String::new(),
ctx: ExprContext::Invalid,
})
};
Pattern::MatchValue(ast::PatternMatchValue {
value: Box::new(invalid_node),
range: self.missing_node_range(),
})
}
}
}
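// Illustrative Python: a lone name is a capture pattern, a dotted name is a
// value pattern, and `_` is the wildcard, e.g.
//
//     match color:
//         case Color.RED:   # value pattern: compares against Color.RED
//             ...
//         case _:           # wildcard: matches anything, binds nothing
//             ...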
/// Parses an attribute expression until the current token is not a `.`.
fn parse_attr_expr_for_match_pattern(&mut self, mut lhs: Expr, start: TextSize) -> Expr {
while self.current_token_kind() == TokenKind::Dot {
lhs = Expr::Attribute(self.parse_attribute_expression(lhs, start));
}
lhs
}
/// Parses the [pattern arguments] in a class pattern.
///
/// # Panics
///
/// If the parser isn't positioned at a `(` token.
///
/// See: <https://docs.python.org/3/reference/compound_stmts.html#class-patterns>
///
/// [pattern arguments]: https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-pattern_arguments
fn parse_match_pattern_class(
&mut self,
cls: Pattern,
start: TextSize,
) -> ast::PatternMatchClass {
let arguments_start = self.node_start();
self.bump(TokenKind::Lpar);
let mut patterns = vec![];
let mut keywords = vec![];
let mut has_seen_pattern = false;
let mut has_seen_keyword_pattern = false;
self.parse_comma_separated_list(
RecoveryContextKind::MatchPatternClassArguments,
|parser| {
let pattern_start = parser.node_start();
let pattern = parser.parse_match_pattern();
if parser.eat(TokenKind::Equal) {
has_seen_pattern = false;
has_seen_keyword_pattern = true;
let value_pattern = parser.parse_match_pattern();
// Key can only be an identifier
if let Pattern::MatchAs(ast::PatternMatchAs {
name: Some(attr), ..
}) = pattern
{
keywords.push(ast::PatternKeyword {
attr,
pattern: value_pattern,
range: parser.node_range(pattern_start),
});
} else {
// In case it's not a valid keyword pattern, we'll add an empty identifier
// to indicate that. This is to avoid dropping the parsed value pattern.
keywords.push(ast::PatternKeyword {
attr: ast::Identifier {
id: String::new(),
range: parser.missing_node_range(),
},
pattern: value_pattern,
range: parser.node_range(pattern_start),
});
parser.add_error(
ParseErrorType::OtherError("Invalid keyword pattern".to_string()),
parser.node_range(pattern_start),
);
}
} else {
has_seen_pattern = true;
patterns.push(pattern);
}
if has_seen_keyword_pattern && has_seen_pattern {
parser.add_error(
ParseErrorType::OtherError(
"pattern not allowed after keyword pattern".to_string(),
),
parser.node_range(pattern_start),
);
}
},
);
self.expect(TokenKind::Rpar);
let arguments_range = self.node_range(arguments_start);
let cls = match cls {
Pattern::MatchAs(ast::PatternMatchAs {
name: Some(ident), ..
}) => Box::new(Expr::Name(if ident.is_valid() {
ast::ExprName {
range: ident.range(),
id: ident.id,
ctx: ExprContext::Load,
}
} else {
ast::ExprName {
range: ident.range(),
id: String::new(),
ctx: ExprContext::Invalid,
}
})),
Pattern::MatchValue(ast::PatternMatchValue { value, range: _ })
if matches!(value.as_ref(), Expr::Attribute(_)) =>
{
value
}
pattern => {
self.add_error(
ParseErrorType::OtherError("invalid value for a class pattern".to_string()),
&pattern,
);
Box::new(Expr::Name(ast::ExprName {
id: String::new(),
ctx: ExprContext::Invalid,
range: pattern.range(),
}))
}
};
ast::PatternMatchClass {
cls,
arguments: ast::PatternArguments {
patterns,
keywords,
range: arguments_range,
},
range: self.node_range(start),
}
}
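// Illustrative Python for the class pattern parsed above:
//
//     match point:
//         case Point(0, y=1):    # one positional, one keyword pattern
//             ...
//
// A positional pattern after a keyword pattern (e.g. `Point(y=1, 0)`) is
// flagged by the `has_seen_keyword_pattern` check, and the class itself may
// only be a name or a dotted name, which the final `match` on `cls` enforces.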
}


@@ -0,0 +1,36 @@
use crate::parser::Parser;
use crate::TokenKind;
use ruff_text_size::TextSize;
/// Captures the parser's current position to allow checking that parsing is still making progress.
#[derive(Debug, Copy, Clone, Default)]
pub(super) struct ParserProgress(Option<(TokenKind, TextSize)>);
impl ParserProgress {
/// Returns `true` if the parser has moved past the recorded position.
#[inline]
fn has_progressed(self, p: &Parser) -> bool {
match self.0 {
None => true,
Some(snapshot) => snapshot != (p.current_token_kind(), p.current_token_range().start()),
}
}
/// Asserts that the parsing is still making progress.
///
/// # Panics
///
/// Panics if the parser hasn't progressed since the last call.
#[inline]
pub(super) fn assert_progressing(&mut self, p: &Parser) {
assert!(
self.has_progressed(p),
"The parser is no longer progressing. Stuck at '{}' {:?}:{:?}",
p.src_text(p.current_token_range()),
p.current_token_kind(),
p.current_token_range(),
);
self.0 = Some((p.current_token_kind(), p.current_token_range().start()));
}
}
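// A minimal sketch of the intended call pattern (illustrative; the loop
// condition is hypothetical):
//
//     let mut progress = ParserProgress::default();
//     while parser.at_sequence_element() {
//         progress.assert_progressing(&parser);
//         // ... parse one element; error recovery must consume at least one
//         // token, or the assertion fires on the next iteration.
//     }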


@@ -1,5 +1,5 @@
---
-source: crates/ruff_python_parser/src/invalid.rs
+source: crates/ruff_python_parser/src/parser/tests.rs
expression: ast
---
Ok(

File diff suppressed because it is too large


@@ -0,0 +1,15 @@
mod parser;
mod suite;
// This is a sanity test for what looks like an IPython directive being
// assigned to. It doesn't actually parse as an assignment statement, but
// rather as a directive whose value is `foo = 42`.
#[test]
fn ok_ipy_escape_command() {
use crate::Mode;
let src = r"!foo = 42";
let tokens = crate::lexer::lex(src, Mode::Ipython).collect();
let ast = crate::parse_tokens(tokens, src, Mode::Ipython);
insta::assert_debug_snapshot!(ast);
}

File diff suppressed because it is too large


@@ -0,0 +1,136 @@
---
source: crates/ruff_python_parser/src/parser/tests/parser.rs
expression: "parse(\"\nx: int\n(y): 1 + 2\nvar: tuple[int] | int = 1,\n\")"
---
Program {
ast: Module(
ModModule {
range: 0..46,
body: [
AnnAssign(
StmtAnnAssign {
range: 1..7,
target: Name(
ExprName {
range: 1..2,
id: "x",
ctx: Store,
},
),
annotation: Name(
ExprName {
range: 4..7,
id: "int",
ctx: Load,
},
),
value: None,
simple: true,
},
),
AnnAssign(
StmtAnnAssign {
range: 8..18,
target: Name(
ExprName {
range: 9..10,
id: "y",
ctx: Store,
},
),
annotation: BinOp(
ExprBinOp {
range: 13..18,
left: NumberLiteral(
ExprNumberLiteral {
range: 13..14,
value: Int(
1,
),
},
),
op: Add,
right: NumberLiteral(
ExprNumberLiteral {
range: 17..18,
value: Int(
2,
),
},
),
},
),
value: None,
simple: false,
},
),
AnnAssign(
StmtAnnAssign {
range: 19..45,
target: Name(
ExprName {
range: 19..22,
id: "var",
ctx: Store,
},
),
annotation: BinOp(
ExprBinOp {
range: 24..40,
left: Subscript(
ExprSubscript {
range: 24..34,
value: Name(
ExprName {
range: 24..29,
id: "tuple",
ctx: Load,
},
),
slice: Name(
ExprName {
range: 30..33,
id: "int",
ctx: Load,
},
),
ctx: Load,
},
),
op: BitOr,
right: Name(
ExprName {
range: 37..40,
id: "int",
ctx: Load,
},
),
},
),
value: Some(
Tuple(
ExprTuple {
range: 43..45,
elts: [
NumberLiteral(
ExprNumberLiteral {
range: 43..44,
value: Int(
1,
),
},
),
],
ctx: Load,
parenthesized: false,
},
),
),
simple: true,
},
),
],
},
),
parse_errors: [],
}


@@ -0,0 +1,264 @@
---
source: crates/ruff_python_parser/src/parser/tests/parser.rs
expression: "parse(\"\nx = 1\n[] = *l\n() = *t\na, b = ab\n*a = 1 + 2\na = b = c\nfoo.bar = False\nbaz[0] = 42\n\")"
---
Program {
ast: Module(
ModModule {
range: 0..82,
body: [
Assign(
StmtAssign {
range: 1..6,
targets: [
Name(
ExprName {
range: 1..2,
id: "x",
ctx: Store,
},
),
],
value: NumberLiteral(
ExprNumberLiteral {
range: 5..6,
value: Int(
1,
),
},
),
},
),
Assign(
StmtAssign {
range: 7..14,
targets: [
List(
ExprList {
range: 7..9,
elts: [],
ctx: Store,
},
),
],
value: Starred(
ExprStarred {
range: 12..14,
value: Name(
ExprName {
range: 13..14,
id: "l",
ctx: Load,
},
),
ctx: Load,
},
),
},
),
Assign(
StmtAssign {
range: 15..22,
targets: [
Tuple(
ExprTuple {
range: 15..17,
elts: [],
ctx: Store,
parenthesized: true,
},
),
],
value: Starred(
ExprStarred {
range: 20..22,
value: Name(
ExprName {
range: 21..22,
id: "t",
ctx: Load,
},
),
ctx: Load,
},
),
},
),
Assign(
StmtAssign {
range: 23..32,
targets: [
Tuple(
ExprTuple {
range: 23..27,
elts: [
Name(
ExprName {
range: 23..24,
id: "a",
ctx: Store,
},
),
Name(
ExprName {
range: 26..27,
id: "b",
ctx: Store,
},
),
],
ctx: Store,
parenthesized: false,
},
),
],
value: Name(
ExprName {
range: 30..32,
id: "ab",
ctx: Load,
},
),
},
),
Assign(
StmtAssign {
range: 33..43,
targets: [
Starred(
ExprStarred {
range: 33..35,
value: Name(
ExprName {
range: 34..35,
id: "a",
ctx: Store,
},
),
ctx: Store,
},
),
],
value: BinOp(
ExprBinOp {
range: 38..43,
left: NumberLiteral(
ExprNumberLiteral {
range: 38..39,
value: Int(
1,
),
},
),
op: Add,
right: NumberLiteral(
ExprNumberLiteral {
range: 42..43,
value: Int(
2,
),
},
),
},
),
},
),
Assign(
StmtAssign {
range: 44..53,
targets: [
Name(
ExprName {
range: 44..45,
id: "a",
ctx: Store,
},
),
Name(
ExprName {
range: 48..49,
id: "b",
ctx: Store,
},
),
],
value: Name(
ExprName {
range: 52..53,
id: "c",
ctx: Load,
},
),
},
),
Assign(
StmtAssign {
range: 54..69,
targets: [
Attribute(
ExprAttribute {
range: 54..61,
value: Name(
ExprName {
range: 54..57,
id: "foo",
ctx: Load,
},
),
attr: Identifier {
id: "bar",
range: 58..61,
},
ctx: Store,
},
),
],
value: BooleanLiteral(
ExprBooleanLiteral {
range: 64..69,
value: false,
},
),
},
),
Assign(
StmtAssign {
range: 70..81,
targets: [
Subscript(
ExprSubscript {
range: 70..76,
value: Name(
ExprName {
range: 70..73,
id: "baz",
ctx: Load,
},
),
slice: NumberLiteral(
ExprNumberLiteral {
range: 74..75,
value: Int(
0,
),
},
),
ctx: Store,
},
),
],
value: NumberLiteral(
ExprNumberLiteral {
range: 79..81,
value: Int(
42,
),
},
),
},
),
],
},
),
parse_errors: [],
}


@@ -0,0 +1,155 @@
---
source: crates/ruff_python_parser/src/parser/tests/parser.rs
expression: "parse(\"\nasync def f():\n ...\n\nasync for i in iter:\n ...\n\nasync with x:\n ...\n\n@a\nasync def x():\n ...\n\")"
---
Program {
ast: Module(
ModModule {
range: 0..104,
body: [
FunctionDef(
StmtFunctionDef {
range: 1..23,
is_async: true,
decorator_list: [],
name: Identifier {
id: "f",
range: 11..12,
},
type_params: None,
parameters: Parameters {
range: 12..14,
posonlyargs: [],
args: [],
vararg: None,
kwonlyargs: [],
kwarg: None,
},
returns: None,
body: [
Expr(
StmtExpr {
range: 20..23,
value: EllipsisLiteral(
ExprEllipsisLiteral {
range: 20..23,
},
),
},
),
],
},
),
For(
StmtFor {
range: 25..53,
is_async: true,
target: Name(
ExprName {
range: 35..36,
id: "i",
ctx: Store,
},
),
iter: Name(
ExprName {
range: 40..44,
id: "iter",
ctx: Load,
},
),
body: [
Expr(
StmtExpr {
range: 50..53,
value: EllipsisLiteral(
ExprEllipsisLiteral {
range: 50..53,
},
),
},
),
],
orelse: [],
},
),
With(
StmtWith {
range: 55..76,
is_async: true,
items: [
WithItem {
range: 66..67,
context_expr: Name(
ExprName {
range: 66..67,
id: "x",
ctx: Load,
},
),
optional_vars: None,
},
],
body: [
Expr(
StmtExpr {
range: 73..76,
value: EllipsisLiteral(
ExprEllipsisLiteral {
range: 73..76,
},
),
},
),
],
},
),
FunctionDef(
StmtFunctionDef {
range: 78..103,
is_async: true,
decorator_list: [
Decorator {
range: 78..80,
expression: Name(
ExprName {
range: 79..80,
id: "a",
ctx: Load,
},
),
},
],
name: Identifier {
id: "x",
range: 91..92,
},
type_params: None,
parameters: Parameters {
range: 92..94,
posonlyargs: [],
args: [],
vararg: None,
kwonlyargs: [],
kwarg: None,
},
returns: None,
body: [
Expr(
StmtExpr {
range: 100..103,
value: EllipsisLiteral(
ExprEllipsisLiteral {
range: 100..103,
},
),
},
),
],
},
),
],
},
),
parse_errors: [],
}


@@ -0,0 +1,184 @@
---
source: crates/ruff_python_parser/src/parser/tests/parser.rs
expression: "parse(\"\nvalue.attr\nvalue.attr()\nvalue().attr\nvalue().attr().foo\nvalue.attr.foo\n\")"
---
Program {
ast: Module(
ModModule {
range: 0..72,
body: [
Expr(
StmtExpr {
range: 1..11,
value: Attribute(
ExprAttribute {
range: 1..11,
value: Name(
ExprName {
range: 1..6,
id: "value",
ctx: Load,
},
),
attr: Identifier {
id: "attr",
range: 7..11,
},
ctx: Load,
},
),
},
),
Expr(
StmtExpr {
range: 12..24,
value: Call(
ExprCall {
range: 12..24,
func: Attribute(
ExprAttribute {
range: 12..22,
value: Name(
ExprName {
range: 12..17,
id: "value",
ctx: Load,
},
),
attr: Identifier {
id: "attr",
range: 18..22,
},
ctx: Load,
},
),
arguments: Arguments {
range: 22..24,
args: [],
keywords: [],
},
},
),
},
),
Expr(
StmtExpr {
range: 25..37,
value: Attribute(
ExprAttribute {
range: 25..37,
value: Call(
ExprCall {
range: 25..32,
func: Name(
ExprName {
range: 25..30,
id: "value",
ctx: Load,
},
),
arguments: Arguments {
range: 30..32,
args: [],
keywords: [],
},
},
),
attr: Identifier {
id: "attr",
range: 33..37,
},
ctx: Load,
},
),
},
),
Expr(
StmtExpr {
range: 38..56,
value: Attribute(
ExprAttribute {
range: 38..56,
value: Call(
ExprCall {
range: 38..52,
func: Attribute(
ExprAttribute {
range: 38..50,
value: Call(
ExprCall {
range: 38..45,
func: Name(
ExprName {
range: 38..43,
id: "value",
ctx: Load,
},
),
arguments: Arguments {
range: 43..45,
args: [],
keywords: [],
},
},
),
attr: Identifier {
id: "attr",
range: 46..50,
},
ctx: Load,
},
),
arguments: Arguments {
range: 50..52,
args: [],
keywords: [],
},
},
),
attr: Identifier {
id: "foo",
range: 53..56,
},
ctx: Load,
},
),
},
),
Expr(
StmtExpr {
range: 57..71,
value: Attribute(
ExprAttribute {
range: 57..71,
value: Attribute(
ExprAttribute {
range: 57..67,
value: Name(
ExprName {
range: 57..62,
id: "value",
ctx: Load,
},
),
attr: Identifier {
id: "attr",
range: 63..67,
},
ctx: Load,
},
),
attr: Identifier {
id: "foo",
range: 68..71,
},
ctx: Load,
},
),
},
),
],
},
),
parse_errors: [],
}


@@ -0,0 +1,329 @@
---
source: crates/ruff_python_parser/src/parser/tests/parser.rs
expression: "parse(\"\na += 1\na *= b\na -= 1\na /= a + 1\na //= (a + b) - c ** 2\na @= [1,2]\na %= x\na |= 1\na <<= 2\na >>= 2\na ^= ...\na **= 42\n\")"
---
Program {
ast: Module(
ModModule {
range: 0..115,
body: [
AugAssign(
StmtAugAssign {
range: 1..7,
target: Name(
ExprName {
range: 1..2,
id: "a",
ctx: Store,
},
),
op: Add,
value: NumberLiteral(
ExprNumberLiteral {
range: 6..7,
value: Int(
1,
),
},
),
},
),
AugAssign(
StmtAugAssign {
range: 8..14,
target: Name(
ExprName {
range: 8..9,
id: "a",
ctx: Store,
},
),
op: Mult,
value: Name(
ExprName {
range: 13..14,
id: "b",
ctx: Load,
},
),
},
),
AugAssign(
StmtAugAssign {
range: 15..21,
target: Name(
ExprName {
range: 15..16,
id: "a",
ctx: Store,
},
),
op: Sub,
value: NumberLiteral(
ExprNumberLiteral {
range: 20..21,
value: Int(
1,
),
},
),
},
),
AugAssign(
StmtAugAssign {
range: 22..32,
target: Name(
ExprName {
range: 22..23,
id: "a",
ctx: Store,
},
),
op: Div,
value: BinOp(
ExprBinOp {
range: 27..32,
left: Name(
ExprName {
range: 27..28,
id: "a",
ctx: Load,
},
),
op: Add,
right: NumberLiteral(
ExprNumberLiteral {
range: 31..32,
value: Int(
1,
),
},
),
},
),
},
),
AugAssign(
StmtAugAssign {
range: 33..55,
target: Name(
ExprName {
range: 33..34,
id: "a",
ctx: Store,
},
),
op: FloorDiv,
value: BinOp(
ExprBinOp {
range: 39..55,
left: BinOp(
ExprBinOp {
range: 40..45,
left: Name(
ExprName {
range: 40..41,
id: "a",
ctx: Load,
},
),
op: Add,
right: Name(
ExprName {
range: 44..45,
id: "b",
ctx: Load,
},
),
},
),
op: Sub,
right: BinOp(
ExprBinOp {
range: 49..55,
left: Name(
ExprName {
range: 49..50,
id: "c",
ctx: Load,
},
),
op: Pow,
right: NumberLiteral(
ExprNumberLiteral {
range: 54..55,
value: Int(
2,
),
},
),
},
),
},
),
},
),
AugAssign(
StmtAugAssign {
range: 56..66,
target: Name(
ExprName {
range: 56..57,
id: "a",
ctx: Store,
},
),
op: MatMult,
value: List(
ExprList {
range: 61..66,
elts: [
NumberLiteral(
ExprNumberLiteral {
range: 62..63,
value: Int(
1,
),
},
),
NumberLiteral(
ExprNumberLiteral {
range: 64..65,
value: Int(
2,
),
},
),
],
ctx: Load,
},
),
},
),
AugAssign(
StmtAugAssign {
range: 67..73,
target: Name(
ExprName {
range: 67..68,
id: "a",
ctx: Store,
},
),
op: Mod,
value: Name(
ExprName {
range: 72..73,
id: "x",
ctx: Load,
},
),
},
),
AugAssign(
StmtAugAssign {
range: 74..80,
target: Name(
ExprName {
range: 74..75,
id: "a",
ctx: Store,
},
),
op: BitOr,
value: NumberLiteral(
ExprNumberLiteral {
range: 79..80,
value: Int(
1,
),
},
),
},
),
AugAssign(
StmtAugAssign {
range: 81..88,
target: Name(
ExprName {
range: 81..82,
id: "a",
ctx: Store,
},
),
op: LShift,
value: NumberLiteral(
ExprNumberLiteral {
range: 87..88,
value: Int(
2,
),
},
),
},
),
AugAssign(
StmtAugAssign {
range: 89..96,
target: Name(
ExprName {
range: 89..90,
id: "a",
ctx: Store,
},
),
op: RShift,
value: NumberLiteral(
ExprNumberLiteral {
range: 95..96,
value: Int(
2,
),
},
),
},
),
AugAssign(
StmtAugAssign {
range: 97..105,
target: Name(
ExprName {
range: 97..98,
id: "a",
ctx: Store,
},
),
op: BitXor,
value: EllipsisLiteral(
ExprEllipsisLiteral {
range: 102..105,
},
),
},
),
AugAssign(
StmtAugAssign {
range: 106..114,
target: Name(
ExprName {
range: 106..107,
id: "a",
ctx: Store,
},
),
op: Pow,
value: NumberLiteral(
ExprNumberLiteral {
range: 112..114,
value: Int(
42,
),
},
),
},
),
],
},
),
parse_errors: [],
}

Some files were not shown because too many files have changed in this diff.