[red-knot] Detect semantic syntax errors (#17463)

Summary
--

This PR extends semantic syntax error detection to red-knot. The main
changes here are:

1. Adding `SemanticSyntaxChecker` and `Vec<SemanticSyntaxError>` fields
to the `SemanticIndexBuilder`
2. Calling `SemanticSyntaxChecker::visit_stmt` and `visit_expr` in the
`SemanticIndexBuilder`'s `visit_stmt` and `visit_expr` methods
3. Implementing `SemanticSyntaxContext` for `SemanticIndexBuilder`
4. Adding new mdtests to test the context implementation and show
diagnostics

(3) is definitely the trickiest and required (I think) a minor addition
to the `SemanticIndexBuilder`. I tried to look around for existing code
performing the necessary checks, but I definitely could have missed
something or misused the existing code even when I found it.

There's still one TODO around `global` statement handling. I don't think
there's an existing way to look this up, but I'm happy to work on that
here or in a separate PR. This currently only affects detection of one
error (`LoadBeforeGlobalDeclaration` or
[PLE0118](https://docs.astral.sh/ruff/rules/load-before-global-declaration/)
in ruff), so it's not too big of a problem even if we leave the TODO.

Test Plan
--

New mdtests, as well as new errors for existing mdtests

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
This commit is contained in:
Brent Westbrook
2025-04-23 09:52:58 -04:00
committed by GitHub
parent 624f5c6c22
commit e7f38fe74b
13 changed files with 431 additions and 66 deletions

View File

@@ -56,40 +56,41 @@ def _(
def bar() -> None:
return None
def _(
a: 1, # error: [invalid-type-form] "Int literals are not allowed in this context in a type expression"
b: 2.3, # error: [invalid-type-form] "Float literals are not allowed in type expressions"
c: 4j, # error: [invalid-type-form] "Complex literals are not allowed in type expressions"
d: True, # error: [invalid-type-form] "Boolean literals are not allowed in this context in a type expression"
e: int | b"foo", # error: [invalid-type-form] "Bytes literals are not allowed in this context in a type expression"
f: 1 and 2, # error: [invalid-type-form] "Boolean operations are not allowed in type expressions"
g: 1 or 2, # error: [invalid-type-form] "Boolean operations are not allowed in type expressions"
h: (foo := 1), # error: [invalid-type-form] "Named expressions are not allowed in type expressions"
i: not 1, # error: [invalid-type-form] "Unary operations are not allowed in type expressions"
j: lambda: 1, # error: [invalid-type-form] "`lambda` expressions are not allowed in type expressions"
k: 1 if True else 2, # error: [invalid-type-form] "`if` expressions are not allowed in type expressions"
l: await 1, # error: [invalid-type-form] "`await` expressions are not allowed in type expressions"
m: (yield 1), # error: [invalid-type-form] "`yield` expressions are not allowed in type expressions"
n: (yield from [1]), # error: [invalid-type-form] "`yield from` expressions are not allowed in type expressions"
o: 1 < 2, # error: [invalid-type-form] "Comparison expressions are not allowed in type expressions"
p: bar(), # error: [invalid-type-form] "Function calls are not allowed in type expressions"
q: int | f"foo", # error: [invalid-type-form] "F-strings are not allowed in type expressions"
r: [1, 2, 3][1:2], # error: [invalid-type-form] "Slices are not allowed in type expressions"
):
reveal_type(a) # revealed: Unknown
reveal_type(b) # revealed: Unknown
reveal_type(c) # revealed: Unknown
reveal_type(d) # revealed: Unknown
reveal_type(e) # revealed: int | Unknown
reveal_type(f) # revealed: Unknown
reveal_type(g) # revealed: Unknown
reveal_type(h) # revealed: Unknown
reveal_type(i) # revealed: Unknown
reveal_type(j) # revealed: Unknown
reveal_type(k) # revealed: Unknown
reveal_type(p) # revealed: Unknown
reveal_type(q) # revealed: int | Unknown
reveal_type(r) # revealed: @Todo(unknown type subscript)
async def outer(): # avoid unrelated syntax errors on yield, yield from, and await
def _(
a: 1, # error: [invalid-type-form] "Int literals are not allowed in this context in a type expression"
b: 2.3, # error: [invalid-type-form] "Float literals are not allowed in type expressions"
c: 4j, # error: [invalid-type-form] "Complex literals are not allowed in type expressions"
d: True, # error: [invalid-type-form] "Boolean literals are not allowed in this context in a type expression"
e: int | b"foo", # error: [invalid-type-form] "Bytes literals are not allowed in this context in a type expression"
f: 1 and 2, # error: [invalid-type-form] "Boolean operations are not allowed in type expressions"
g: 1 or 2, # error: [invalid-type-form] "Boolean operations are not allowed in type expressions"
h: (foo := 1), # error: [invalid-type-form] "Named expressions are not allowed in type expressions"
i: not 1, # error: [invalid-type-form] "Unary operations are not allowed in type expressions"
j: lambda: 1, # error: [invalid-type-form] "`lambda` expressions are not allowed in type expressions"
k: 1 if True else 2, # error: [invalid-type-form] "`if` expressions are not allowed in type expressions"
l: await 1, # error: [invalid-type-form] "`await` expressions are not allowed in type expressions"
m: (yield 1), # error: [invalid-type-form] "`yield` expressions are not allowed in type expressions"
n: (yield from [1]), # error: [invalid-type-form] "`yield from` expressions are not allowed in type expressions"
o: 1 < 2, # error: [invalid-type-form] "Comparison expressions are not allowed in type expressions"
p: bar(), # error: [invalid-type-form] "Function calls are not allowed in type expressions"
q: int | f"foo", # error: [invalid-type-form] "F-strings are not allowed in type expressions"
r: [1, 2, 3][1:2], # error: [invalid-type-form] "Slices are not allowed in type expressions"
):
reveal_type(a) # revealed: Unknown
reveal_type(b) # revealed: Unknown
reveal_type(c) # revealed: Unknown
reveal_type(d) # revealed: Unknown
reveal_type(e) # revealed: int | Unknown
reveal_type(f) # revealed: Unknown
reveal_type(g) # revealed: Unknown
reveal_type(h) # revealed: Unknown
reveal_type(i) # revealed: Unknown
reveal_type(j) # revealed: Unknown
reveal_type(k) # revealed: Unknown
reveal_type(p) # revealed: Unknown
reveal_type(q) # revealed: int | Unknown
reveal_type(r) # revealed: @Todo(unknown type subscript)
```
## Invalid Collection based AST nodes

View File

@@ -127,8 +127,9 @@ class AsyncIterable:
def __aiter__(self) -> AsyncIterator:
return AsyncIterator()
# revealed: @Todo(async iterables/iterators)
[reveal_type(x) async for x in AsyncIterable()]
async def _():
# revealed: @Todo(async iterables/iterators)
[reveal_type(x) async for x in AsyncIterable()]
```
### Invalid async comprehension
@@ -145,6 +146,7 @@ class Iterable:
def __iter__(self) -> Iterator:
return Iterator()
# revealed: @Todo(async iterables/iterators)
[reveal_type(x) async for x in Iterable()]
async def _():
# revealed: @Todo(async iterables/iterators)
[reveal_type(x) async for x in Iterable()]
```

View File

@@ -0,0 +1,165 @@
# Semantic syntax error diagnostics
## `async` comprehensions in synchronous comprehensions
### Python 3.10
<!-- snapshot-diagnostics -->
Before Python 3.11, `async` comprehensions could not be used within outer sync comprehensions, even
within an `async` function ([CPython issue](https://github.com/python/cpython/issues/77527)):
```toml
[environment]
python-version = "3.10"
```
```py
async def elements(n):
yield n
async def f():
# error: 19 [invalid-syntax] "cannot use an asynchronous comprehension outside of an asynchronous function on Python 3.10 (syntax was added in 3.11)"
return {n: [x async for x in elements(n)] for n in range(3)}
```
If all of the comprehensions are `async`, on the other hand, the code was still valid:
```py
async def test():
return [[x async for x in elements(n)] async for n in range(3)]
```
These are a couple of tricky but valid cases to check that nested scope handling is wired up
correctly in the `SemanticSyntaxContext` trait:
```py
async def f():
[x for x in [1]] and [x async for x in elements(1)]
async def f():
def g():
pass
[x async for x in elements(1)]
```
### Python 3.11
All of these same examples are valid after Python 3.11:
```toml
[environment]
python-version = "3.11"
```
```py
async def elements(n):
yield n
async def f():
return {n: [x async for x in elements(n)] for n in range(3)}
```
## Late `__future__` import
```py
from collections import namedtuple
# error: [invalid-syntax] "__future__ imports must be at the top of the file"
from __future__ import print_function
```
## Invalid annotation
This one might be a bit redundant with the `invalid-type-form` error.
```toml
[environment]
python-version = "3.12"
```
```py
from __future__ import annotations
# error: [invalid-type-form] "Named expressions are not allowed in type expressions"
# error: [invalid-syntax] "named expression cannot be used within a type annotation"
def f() -> (y := 3): ...
```
## Duplicate `match` key
```toml
[environment]
python-version = "3.10"
```
```py
match 2:
# error: [invalid-syntax] "mapping pattern checks duplicate key `"x"`"
case {"x": 1, "x": 2}:
...
```
## `return`, `yield`, `yield from`, and `await` outside function
```py
# error: [invalid-syntax] "`return` statement outside of a function"
return
# error: [invalid-syntax] "`yield` statement outside of a function"
yield
# error: [invalid-syntax] "`yield from` statement outside of a function"
yield from []
# error: [invalid-syntax] "`await` statement outside of a function"
# error: [invalid-syntax] "`await` outside of an asynchronous function"
await 1
def f():
# error: [invalid-syntax] "`await` outside of an asynchronous function"
await 1
```
Generators are evaluated lazily, so `await` is allowed, even outside of a function.
```py
async def g():
yield 1
(x async for x in g())
```
## `await` outside async function
This error includes `await`, `async for`, `async with`, and `async` comprehensions.
```python
async def elements(n):
yield n
def _():
# error: [invalid-syntax] "`await` outside of an asynchronous function"
await 1
# error: [invalid-syntax] "`async for` outside of an asynchronous function"
async for _ in elements(1):
...
# error: [invalid-syntax] "`async with` outside of an asynchronous function"
async with elements(1) as x:
...
# error: [invalid-syntax] "cannot use an asynchronous comprehension outside of an asynchronous function on Python 3.9 (syntax was added in 3.11)"
# error: [invalid-syntax] "asynchronous comprehension outside of an asynchronous function"
[x async for x in elements(1)]
```
## Load before `global` declaration
This should be an error, but it's not yet.
TODO implement `SemanticSyntaxContext::global`
```py
def f():
x = 1
global x
```

View File

@@ -189,7 +189,7 @@ match 42:
...
case [O]:
...
case P | Q:
case P | Q: # error: [invalid-syntax] "name capture `P` makes remaining patterns unreachable"
...
case object(foo=R):
...
@@ -289,7 +289,7 @@ match 42:
...
case [D]:
...
case E | F:
case E | F: # error: [invalid-syntax] "name capture `E` makes remaining patterns unreachable"
...
case object(foo=G):
...
@@ -357,7 +357,7 @@ match 42:
...
case [D]:
...
case E | F:
case E | F: # error: [invalid-syntax] "name capture `E` makes remaining patterns unreachable"
...
case object(foo=G):
...

View File

@@ -0,0 +1,46 @@
---
source: crates/red_knot_test/src/lib.rs
expression: snapshot
---
---
mdtest name: semantic_syntax_errors.md - Semantic syntax error diagnostics - `async` comprehensions in synchronous comprehensions - Python 3.10
mdtest path: crates/red_knot_python_semantic/resources/mdtest/diagnostics/semantic_syntax_errors.md
---
# Python source files
## mdtest_snippet.py
```
1 | async def elements(n):
2 | yield n
3 |
4 | async def f():
5 | # error: 19 [invalid-syntax] "cannot use an asynchronous comprehension outside of an asynchronous function on Python 3.10 (syntax was added in 3.11)"
6 | return {n: [x async for x in elements(n)] for n in range(3)}
7 | async def test():
8 | return [[x async for x in elements(n)] async for n in range(3)]
9 | async def f():
10 | [x for x in [1]] and [x async for x in elements(1)]
11 |
12 | async def f():
13 | def g():
14 | pass
15 | [x async for x in elements(1)]
```
# Diagnostics
```
error: invalid-syntax
--> /src/mdtest_snippet.py:6:19
|
4 | async def f():
5 | # error: 19 [invalid-syntax] "cannot use an asynchronous comprehension outside of an asynchronous function on Python 3.10 (syntax...
6 | return {n: [x async for x in elements(n)] for n in range(3)}
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot use an asynchronous comprehension outside of an asynchronous function on Python 3.10 (syntax was added in 3.11)
7 | async def test():
8 | return [[x async for x in elements(n)] async for n in range(3)]
|
```