[red-knot] Method calls and the descriptor protocol (#16121)
## Summary
This PR achieves the following:
* Add support for checking method calls, and inferring return types from
method calls. For example:
```py
reveal_type("abcde".find("abc")) # revealed: int
reveal_type("foo".encode(encoding="utf-8")) # revealed: bytes
"abcde".find(123) # error: [invalid-argument-type]
class C:
def f(self) -> int:
pass
reveal_type(C.f) # revealed: <function `f`>
reveal_type(C().f) # revealed: <bound method: `f` of `C`>
C.f() # error: [missing-argument]
reveal_type(C().f()) # revealed: int
```
* Implement the descriptor protocol, i.e. properly call the `__get__`
method when a descriptor object is accessed through a class object or an
instance of a class. For example:
```py
from typing import Literal
class Ten:
def __get__(self, instance: object, owner: type | None = None) ->
Literal[10]:
return 10
class C:
ten: Ten = Ten()
reveal_type(C.ten) # revealed: Literal[10]
reveal_type(C().ten) # revealed: Literal[10]
```
* Add support for member lookup on intersection types.
* Support type inference for `inspect.getattr_static(obj, attr)` calls.
This was mostly used as a debugging tool during development, but seems
more generally useful. It can be used to bypass the descriptor protocol.
For the example above:
```py
from inspect import getattr_static
reveal_type(getattr_static(C, "ten")) # revealed: Ten
```
* Add a new `Type::Callable(…)` variant with the following sub-variants:
* `Type::Callable(CallableType::BoundMethod(…))` — represents bound
method objects, e.g. `C().f` above
* `Type::Callable(CallableType::MethodWrapperDunderGet(…))` — represents
`f.__get__` where `f` is a function
* `Type::Callable(WrapperDescriptorDunderGet)` — represents
`FunctionType.__get__`
* Add new known classes:
* `types.MethodType`
* `types.MethodWrapperType`
* `types.WrapperDescriptorType`
* `builtins.range`
## Performance analysis
On this branch, we do more work. We need to do more call checking, since
we now check all method calls. We also need to do ~twice as many member
lookups, because we need to check if a `__get__` attribute exists on
accessed members.
A brief analysis on `tomllib` shows that we now call `Type::call` 1780
times, compared to 612 calls before.
## Limitations
* Data descriptors are not yet supported, i.e. we do not infer correct
types for descriptor attribute accesses in `Store` context and do not
check writes to descriptor attributes. I felt like this was something
that could be split out as a follow-up without risking a major
architectural change.
* We currently distinguish between `Type::member` (with descriptor
protocol) and `Type::static_member` (without descriptor protocol). The
former corresponds to `obj.attr`, the latter corresponds to
`getattr_static(obj, "attr")`. However, to model some details correctly,
we would also need to distinguish between a static member lookup *with*
and *without* instance variables. The lookup without instance variables
corresponds to `find_name_in_mro`
[here](https://docs.python.org/3/howto/descriptor.html#invocation-from-an-instance).
We currently approximate both using `member_static`, which leads to two
open TODOs. Changing this would be a larger refactoring of
`Type::own_instance_member`, so I chose to leave it out of this PR.
## Test Plan
* New `call/methods.md` test suite for method calls
* New tests in `descriptor_protocol.md`
* New `call/getattr_static.md` test suite for `inspect.getattr_static`
* Various updated tests
This commit is contained in:
258
crates/red_knot_python_semantic/resources/mdtest/call/methods.md
Normal file
258
crates/red_knot_python_semantic/resources/mdtest/call/methods.md
Normal file
@@ -0,0 +1,258 @@
|
||||
# Methods
|
||||
|
||||
## Background: Functions as descriptors
|
||||
|
||||
> Note: See also this related section in the descriptor guide: [Functions and methods].
|
||||
|
||||
Say we have a simple class `C` with a function definition `f` inside its body:
|
||||
|
||||
```py
|
||||
class C:
|
||||
def f(self, x: int) -> str:
|
||||
return "a"
|
||||
```
|
||||
|
||||
Whenever we access the `f` attribute through the class object itself (`C.f`) or through an instance
|
||||
(`C().f`), this access happens via the descriptor protocol. Functions are (non-data) descriptors
|
||||
because they implement a `__get__` method. This is crucial in making sure that method calls work as
|
||||
expected. In general, the signature of the `__get__` method in the descriptor protocol is
|
||||
`__get__(self, instance, owner)`. The `self` argument is the descriptor object itself (`f`). The
|
||||
passed value for the `instance` argument depends on whether the attribute is accessed from the class
|
||||
object (in which case it is `None`), or from an instance (in which case it is the instance of type
|
||||
`C`). The `owner` argument is the class itself (`C` of type `Literal[C]`). To summarize:
|
||||
|
||||
- `C.f` is equivalent to `getattr_static(C, "f").__get__(None, C)`
|
||||
- `C().f` is equivalent to `getattr_static(C, "f").__get__(C(), C)`
|
||||
|
||||
Here, `inspect.getattr_static` is used to bypass the descriptor protocol and directly access the
|
||||
function attribute. The way the special `__get__` method *on functions* works is as follows. In the
|
||||
former case, if the `instance` argument is `None`, `__get__` simply returns the function itself. In
|
||||
the latter case, it returns a *bound method* object:
|
||||
|
||||
```py
|
||||
from inspect import getattr_static
|
||||
|
||||
reveal_type(getattr_static(C, "f")) # revealed: Literal[f]
|
||||
|
||||
reveal_type(getattr_static(C, "f").__get__) # revealed: <method-wrapper `__get__` of `f`>
|
||||
|
||||
reveal_type(getattr_static(C, "f").__get__(None, C)) # revealed: Literal[f]
|
||||
reveal_type(getattr_static(C, "f").__get__(C(), C)) # revealed: <bound method `f` of `C`>
|
||||
```
|
||||
|
||||
In conclusion, this is why we see the following two types when accessing the `f` attribute on the
|
||||
class object `C` and on an instance `C()`:
|
||||
|
||||
```py
|
||||
reveal_type(C.f) # revealed: Literal[f]
|
||||
reveal_type(C().f) # revealed: <bound method `f` of `C`>
|
||||
```
|
||||
|
||||
A bound method is a callable object that contains a reference to the `instance` that it was called
|
||||
on (can be inspected via `__self__`), and the function object that it refers to (can be inspected
|
||||
via `__func__`):
|
||||
|
||||
```py
|
||||
bound_method = C().f
|
||||
|
||||
reveal_type(bound_method.__self__) # revealed: C
|
||||
reveal_type(bound_method.__func__) # revealed: Literal[f]
|
||||
```
|
||||
|
||||
When we call the bound method, the `instance` is implicitly passed as the first argument (`self`):
|
||||
|
||||
```py
|
||||
reveal_type(C().f(1)) # revealed: str
|
||||
reveal_type(bound_method(1)) # revealed: str
|
||||
```
|
||||
|
||||
When we call the function object itself, we need to pass the `instance` explicitly:
|
||||
|
||||
```py
|
||||
C.f(1) # error: [missing-argument]
|
||||
|
||||
reveal_type(C.f(C(), 1)) # revealed: str
|
||||
```
|
||||
|
||||
When we access methods from derived classes, they will be bound to instances of the derived class:
|
||||
|
||||
```py
|
||||
class D(C):
|
||||
pass
|
||||
|
||||
reveal_type(D().f) # revealed: <bound method `f` of `D`>
|
||||
```
|
||||
|
||||
If we access an attribute on a bound method object itself, it will defer to `types.MethodType`:
|
||||
|
||||
```py
|
||||
reveal_type(bound_method.__hash__) # revealed: <bound method `__hash__` of `MethodType`>
|
||||
```
|
||||
|
||||
If an attribute is not available on the bound method object, it will be looked up on the underlying
|
||||
function object. We model this explicitly, which means that we can access `__kwdefaults__` on bound
|
||||
methods, even though it is not available on `types.MethodType`:
|
||||
|
||||
```py
|
||||
reveal_type(bound_method.__kwdefaults__) # revealed: @Todo(generics) | None
|
||||
```
|
||||
|
||||
## Basic method calls on class objects and instances
|
||||
|
||||
```py
|
||||
class Base:
|
||||
def method_on_base(self, x: int | None) -> str:
|
||||
return "a"
|
||||
|
||||
class Derived(Base):
|
||||
def method_on_derived(self, x: bytes) -> tuple[int, str]:
|
||||
return (1, "a")
|
||||
|
||||
reveal_type(Base().method_on_base(1)) # revealed: str
|
||||
reveal_type(Base.method_on_base(Base(), 1)) # revealed: str
|
||||
|
||||
Base().method_on_base("incorrect") # error: [invalid-argument-type]
|
||||
Base().method_on_base() # error: [missing-argument]
|
||||
Base().method_on_base(1, 2) # error: [too-many-positional-arguments]
|
||||
|
||||
reveal_type(Derived().method_on_base(1)) # revealed: str
|
||||
reveal_type(Derived().method_on_derived(b"abc")) # revealed: tuple[int, str]
|
||||
reveal_type(Derived.method_on_base(Derived(), 1)) # revealed: str
|
||||
reveal_type(Derived.method_on_derived(Derived(), b"abc")) # revealed: tuple[int, str]
|
||||
```
|
||||
|
||||
## Method calls on literals
|
||||
|
||||
### Boolean literals
|
||||
|
||||
```py
|
||||
reveal_type(True.bit_length()) # revealed: int
|
||||
reveal_type(True.as_integer_ratio()) # revealed: tuple[int, Literal[1]]
|
||||
```
|
||||
|
||||
### Integer literals
|
||||
|
||||
```py
|
||||
reveal_type((42).bit_length()) # revealed: int
|
||||
```
|
||||
|
||||
### String literals
|
||||
|
||||
```py
|
||||
reveal_type("abcde".find("abc")) # revealed: int
|
||||
reveal_type("foo".encode(encoding="utf-8")) # revealed: bytes
|
||||
|
||||
"abcde".find(123) # error: [invalid-argument-type]
|
||||
```
|
||||
|
||||
### Bytes literals
|
||||
|
||||
```py
|
||||
reveal_type(b"abcde".startswith(b"abc")) # revealed: bool
|
||||
```
|
||||
|
||||
## Method calls on `LiteralString`
|
||||
|
||||
```py
|
||||
from typing_extensions import LiteralString
|
||||
|
||||
def f(s: LiteralString) -> None:
|
||||
reveal_type(s.find("a")) # revealed: int
|
||||
```
|
||||
|
||||
## Method calls on `tuple`
|
||||
|
||||
```py
|
||||
def f(t: tuple[int, str]) -> None:
|
||||
reveal_type(t.index("a")) # revealed: int
|
||||
```
|
||||
|
||||
## Method calls on unions
|
||||
|
||||
```py
|
||||
from typing import Any
|
||||
|
||||
class A:
|
||||
def f(self) -> int:
|
||||
return 1
|
||||
|
||||
class B:
|
||||
def f(self) -> str:
|
||||
return "a"
|
||||
|
||||
def f(a_or_b: A | B, any_or_a: Any | A):
|
||||
reveal_type(a_or_b.f) # revealed: <bound method `f` of `A`> | <bound method `f` of `B`>
|
||||
reveal_type(a_or_b.f()) # revealed: int | str
|
||||
|
||||
reveal_type(any_or_a.f) # revealed: Any | <bound method `f` of `A`>
|
||||
reveal_type(any_or_a.f()) # revealed: Any | int
|
||||
```
|
||||
|
||||
## Method calls on `KnownInstance` types
|
||||
|
||||
```toml
|
||||
[environment]
|
||||
python-version = "3.12"
|
||||
```
|
||||
|
||||
```py
|
||||
type IntOrStr = int | str
|
||||
|
||||
reveal_type(IntOrStr.__or__) # revealed: <bound method `__or__` of `typing.TypeAliasType`>
|
||||
```
|
||||
|
||||
## Error cases: Calling `__get__` for methods
|
||||
|
||||
The `__get__` method on `types.FunctionType` has the following overloaded signature in typeshed:
|
||||
|
||||
```py
|
||||
from types import FunctionType, MethodType
|
||||
from typing import overload
|
||||
|
||||
@overload
|
||||
def __get__(self, instance: None, owner: type, /) -> FunctionType: ...
|
||||
@overload
|
||||
def __get__(self, instance: object, owner: type | None = None, /) -> MethodType: ...
|
||||
```
|
||||
|
||||
Here, we test that this signature is enforced correctly:
|
||||
|
||||
```py
|
||||
from inspect import getattr_static
|
||||
|
||||
class C:
|
||||
def f(self, x: int) -> str:
|
||||
return "a"
|
||||
|
||||
method_wrapper = getattr_static(C, "f").__get__
|
||||
|
||||
reveal_type(method_wrapper) # revealed: <method-wrapper `__get__` of `f`>
|
||||
|
||||
# All of these are fine:
|
||||
method_wrapper(C(), C)
|
||||
method_wrapper(C())
|
||||
method_wrapper(C(), None)
|
||||
method_wrapper(None, C)
|
||||
|
||||
# Passing `None` without an `owner` argument is an
|
||||
# error: [missing-argument] "No argument provided for required parameter `owner`"
|
||||
method_wrapper(None)
|
||||
|
||||
# Passing something that is not assignable to `type` as the `owner` argument is an
|
||||
# error: [invalid-argument-type] "Object of type `Literal[1]` cannot be assigned to parameter 2 (`owner`); expected type `type`"
|
||||
method_wrapper(None, 1)
|
||||
|
||||
# Passing `None` as the `owner` argument when `instance` is `None` is an
|
||||
# error: [invalid-argument-type] "Object of type `None` cannot be assigned to parameter 2 (`owner`); expected type `type`"
|
||||
method_wrapper(None, None)
|
||||
|
||||
# Calling `__get__` without any arguments is an
|
||||
# error: [missing-argument] "No argument provided for required parameter `instance`"
|
||||
method_wrapper()
|
||||
|
||||
# Calling `__get__` with too many positional arguments is an
|
||||
# error: [too-many-positional-arguments] "Too many positional arguments: expected 2, got 3"
|
||||
method_wrapper(C(), C, "one too many")
|
||||
```
|
||||
|
||||
[functions and methods]: https://docs.python.org/3/howto/descriptor.html#functions-and-methods
|
||||
Reference in New Issue
Block a user