ad1a8da4d1126eff1ec59ac4763f2c44e3dc33f4
2 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ad1a8da4d1 |
[red-knot] Check for invalid overload usages (#17609)
## Summary Part of #15383, this PR adds the core infrastructure to check for invalid overloads and adds a diagnostic to raise if there are < 2 overloads for a given definition. ### Design notes The requirements to check the overloads are: * Requires `FunctionType` which has the `to_overloaded` method * The `FunctionType` **should** be for the function that is either the implementation or the last overload if the implementation doesn't exists * Avoid checking any `FunctionType` that are part of an overload chain * Consider visibility constraints This required a couple of iteration to make sure all of the above requirements are fulfilled. #### 1. Use a set to deduplicate The logic would first collect all the `FunctionType` that are part of the overload chain except for the implementation or the last overload if the implementation doesn't exists. Then, when iterating over all the function declarations within the scope, we'd avoid checking these functions. But, this approach would fail to consider visibility constraints as certain overloads _can_ be behind a version check. Those aren't part of the overload chain but those aren't a separate overload chain either. <details><summary>Implementation:</summary> <p> ```rs fn check_overloaded_functions(&mut self) { let function_definitions = || { self.types .declarations .iter() .filter_map(|(definition, ty)| { // Filter out function literals that result from anything other than a function // definition e.g., imports. if let DefinitionKind::Function(function) = definition.kind(self.db()) { ty.inner_type() .into_function_literal() .map(|ty| (ty, definition.symbol(self.db()), function.node())) } else { None } }) }; // A set of all the functions that are part of an overloaded function definition except for // the implementation function and the last overload in case the implementation doesn't // exists. This allows us to collect all the function definitions that needs to be skipped // when checking for invalid overload usages. let mut overloads: HashSet<FunctionType<'db>> = HashSet::default(); for (function, _) in function_definitions() { let Some(overloaded) = function.to_overloaded(self.db()) else { continue; }; if overloaded.implementation.is_some() { overloads.extend(overloaded.overloads.iter().copied()); } else if let Some((_, previous_overloads)) = overloaded.overloads.split_last() { overloads.extend(previous_overloads.iter().copied()); } } for (function, function_node) in function_definitions() { let Some(overloaded) = function.to_overloaded(self.db()) else { continue; }; if overloads.contains(&function) { continue; } // At this point, the `function` variable is either the implementation function or the // last overloaded function if the implementation doesn't exists. if overloaded.overloads.len() < 2 { if let Some(builder) = self .context .report_lint(&INVALID_OVERLOAD, &function_node.name) { let mut diagnostic = builder.into_diagnostic(format_args!( "Function `{}` requires at least two overloads", &function_node.name )); if let Some(first_overload) = overloaded.overloads.first() { diagnostic.annotate( self.context .secondary(first_overload.focus_range(self.db())) .message(format_args!("Only one overload defined here")), ); } } } } } ``` </p> </details> #### 2. Define a `predecessor` query The `predecessor` query would return the previous `FunctionType` for the given `FunctionType` i.e., the current logic would be extracted to be a query instead. This could then be used to make sure that we're checking the entire overload chain once. The way this would've been implemented is to have a `to_overloaded` implementation which would take the root of the overload chain instead of the leaf. But, this would require updates to the use-def map to somehow be able to return the _following_ functions for a given definition. #### 3. Create a successor link This is what Pyrefly uses, we'd create a forward link between two functions that are involved in an overload chain. This means that for a given function, we can get the successor function. This could be used to find the _leaf_ of the overload chain which can then be used with the `to_overloaded` method to get the entire overload chain. But, this would also require updating the use-def map to be able to "see" the _following_ function. ### Implementation This leads us to the final implementation that this PR implements which is to consider the overloaded functions using: * Collect all the **function symbols** that are defined **and** called within the same file. This could potentially be an overloaded function * Use the public bindings to get the leaf of the overload chain and use that to get the entire overload chain via `to_overloaded` and perform the check This has a limitation that in case a function redefines an overload, then that overload will not be checked. For example: ```py from typing import overload @overload def f() -> None: ... @overload def f(x: int) -> int: ... # The above overload will not be checked as the below function with the same name # shadows it def f(*args: int) -> int: ... ``` ## Test Plan Update existing mdtest and add snapshot diagnostics. |
||
|
|
44ad201262 |
[red-knot] Add support for overloaded functions (#17366)
## Summary Part of #15383, this PR adds support for overloaded callables. Typing spec: https://typing.python.org/en/latest/spec/overload.html Specifically, it does the following: 1. Update the `FunctionType::signature` method to return signatures from a possibly overloaded callable using a new `FunctionSignature` enum 2. Update `CallableType` to accommodate overloaded callable by updating the inner type to `Box<[Signature]>` 3. Update the relation methods on `CallableType` with logic specific to overloads 4. Update the display of callable type to display a list of signatures enclosed by parenthesis 5. Update `CallableTypeOf` special form to recognize overloaded callable 6. Update subtyping, assignability and fully static check to account for callables (equivalence is planned to be done as a follow-up) For (2), it is required to be done in this PR because otherwise I'd need to add some workaround for `into_callable_type` and I though it would be best to include it in here. For (2), another possible design would be convert `CallableType` in an enum with two variants `CallableType::Single` and `CallableType::Overload` but I decided to go with `Box<[Signature]>` for now to (a) mirror it to be equivalent to `overload` field on `CallableSignature` and (b) to avoid any refactor in this PR. This could be done in a follow-up to better split the two kind of callables. ### Design There were two main candidates on how to represent the overloaded definition: 1. To include it in the existing infrastructure which is what this PR is doing by recognizing all the signatures within the `FunctionType::signature` method 2. To create a new `Overload` type variant <details><summary>For context, this is what I had in mind with the new type variant:</summary> <p> ```rs pub enum Type { FunctionLiteral(FunctionType), Overload(OverloadType), BoundMethod(BoundMethodType), ... } pub struct OverloadType { // FunctionLiteral or BoundMethod overloads: Box<[Type]>, // FunctionLiteral or BoundMethod implementation: Option<Type> } pub struct BoundMethodType { kind: BoundMethodKind, self_instance: Type, } pub enum BoundMethodKind { Function(FunctionType), Overload(OverloadType), } ``` </p> </details> The main reasons to choose (1) are the simplicity in the implementation, reusing the existing infrastructure, avoiding any complications that the new type variant has specifically around the different variants between function and methods which would require the overload type to use `Type` instead. ### Implementation The core logic is how to collect all the overloaded functions. The way this is done in this PR is by recording a **use** on the `Identifier` node that represents the function name in the use-def map. This is then used to fetch the previous symbol using the same name. This way the signatures are going to be propagated from top to bottom (from first overload to the final overload or the implementation) with each function / method. For example: ```py from typing import overload @overload def foo(x: int) -> int: ... @overload def foo(x: str) -> str: ... def foo(x: int | str) -> int | str: return x ``` Here, each definition of `foo` knows about all the signatures that comes before itself. So, the first overload would only see itself, the second would see the first and itself and so on until the implementation or the final overload. This approach required some updates specifically recognizing `Identifier` node to record the function use because it doesn't use `ExprName`. ## Test Plan Update existing test cases which were limited by the overload support and add test cases for the following cases: * Valid overloads as functions, methods, generics, version specific * Invalid overloads as stated in https://typing.python.org/en/latest/spec/overload.html#invalid-overload-definitions (implementation will be done in a follow-up) * Various relation: fully static, subtyping, and assignability (others in a follow-up) ## Ecosystem changes _WIP_ After going through the ecosystem changes (there are a lot!), here's what I've found: We need assignability check between a callable type and a class literal because a lot of builtins are defined as classes in typeshed whose constructor method is overloaded e.g., `map`, `sorted`, `list.sort`, `max`, `min` with the `key` parameter, `collections.abc.defaultdict`, etc. (https://github.com/astral-sh/ruff/issues/17343). This makes up most of the ecosystem diff **roughly 70 diagnostics**. For example: ```py from collections import defaultdict # red-knot: No overload of bound method `__init__` matches arguments [lint:no-matching-overload] defaultdict(int) # red-knot: No overload of bound method `__init__` matches arguments [lint:no-matching-overload] defaultdict(list) class Foo: def __init__(self, x: int): self.x = x # red-knot: No overload of function `__new__` matches arguments [lint:no-matching-overload] map(Foo, ["a", "b", "c"]) ``` Duplicate diagnostics in unpacking (https://github.com/astral-sh/ruff/issues/16514) has **~16 diagnostics**. Support for the `callable` builtin which requires `TypeIs` support. This is **5 diagnostics**. For example: ```py from typing import Any def _(x: Any | None) -> None: if callable(x): # red-knot: `Any | None` # Pyright: `(...) -> object` # mypy: `Any` # pyrefly: `(...) -> object` reveal_type(x) ``` Narrowing on `assert` which has **11 diagnostics**. This is being worked on in https://github.com/astral-sh/ruff/pull/17345. For example: ```py import re match = re.search("", "") assert match match.group() # error: [possibly-unbound-attribute] ``` Others: * `Self`: 2 * Type aliases: 6 * Generics: 3 * Protocols: 13 * Unpacking in comprehension: 1 (https://github.com/astral-sh/ruff/pull/17396) ## Performance Refer to https://github.com/astral-sh/ruff/pull/17366#issuecomment-2814053046. |