[red-knot] Method calls and the descriptor protocol (#16121)
## Summary
This PR achieves the following:
* Add support for checking method calls, and inferring return types from
method calls. For example:
```py
reveal_type("abcde".find("abc")) # revealed: int
reveal_type("foo".encode(encoding="utf-8")) # revealed: bytes
"abcde".find(123) # error: [invalid-argument-type]
class C:
def f(self) -> int:
pass
reveal_type(C.f) # revealed: <function `f`>
reveal_type(C().f) # revealed: <bound method: `f` of `C`>
C.f() # error: [missing-argument]
reveal_type(C().f()) # revealed: int
```
* Implement the descriptor protocol, i.e. properly call the `__get__`
method when a descriptor object is accessed through a class object or an
instance of a class. For example:
```py
from typing import Literal
class Ten:
def __get__(self, instance: object, owner: type | None = None) ->
Literal[10]:
return 10
class C:
ten: Ten = Ten()
reveal_type(C.ten) # revealed: Literal[10]
reveal_type(C().ten) # revealed: Literal[10]
```
* Add support for member lookup on intersection types.
* Support type inference for `inspect.getattr_static(obj, attr)` calls.
This was mostly used as a debugging tool during development, but seems
more generally useful. It can be used to bypass the descriptor protocol.
For the example above:
```py
from inspect import getattr_static
reveal_type(getattr_static(C, "ten")) # revealed: Ten
```
* Add a new `Type::Callable(…)` variant with the following sub-variants:
* `Type::Callable(CallableType::BoundMethod(…))` — represents bound
method objects, e.g. `C().f` above
* `Type::Callable(CallableType::MethodWrapperDunderGet(…))` — represents
`f.__get__` where `f` is a function
* `Type::Callable(WrapperDescriptorDunderGet)` — represents
`FunctionType.__get__`
* Add new known classes:
* `types.MethodType`
* `types.MethodWrapperType`
* `types.WrapperDescriptorType`
* `builtins.range`
## Performance analysis
On this branch, we do more work. We need to do more call checking, since
we now check all method calls. We also need to do ~twice as many member
lookups, because we need to check if a `__get__` attribute exists on
accessed members.
A brief analysis on `tomllib` shows that we now call `Type::call` 1780
times, compared to 612 calls before.
## Limitations
* Data descriptors are not yet supported, i.e. we do not infer correct
types for descriptor attribute accesses in `Store` context and do not
check writes to descriptor attributes. I felt like this was something
that could be split out as a follow-up without risking a major
architectural change.
* We currently distinguish between `Type::member` (with descriptor
protocol) and `Type::static_member` (without descriptor protocol). The
former corresponds to `obj.attr`, the latter corresponds to
`getattr_static(obj, "attr")`. However, to model some details correctly,
we would also need to distinguish between a static member lookup *with*
and *without* instance variables. The lookup without instance variables
corresponds to `find_name_in_mro`
[here](https://docs.python.org/3/howto/descriptor.html#invocation-from-an-instance).
We currently approximate both using `member_static`, which leads to two
open TODOs. Changing this would be a larger refactoring of
`Type::own_instance_member`, so I chose to leave it out of this PR.
## Test Plan
* New `call/methods.md` test suite for method calls
* New tests in `descriptor_protocol.md`
* New `call/getattr_static.md` test suite for `inspect.getattr_static`
* Various updated tests
This commit is contained in:
@@ -0,0 +1,133 @@
|
||||
# `inspect.getattr_static`
|
||||
|
||||
## Basic usage
|
||||
|
||||
`inspect.getattr_static` is a function that returns attributes of an object without invoking the
|
||||
descriptor protocol (for caveats, see the [official documentation]).
|
||||
|
||||
Consider the following example:
|
||||
|
||||
```py
|
||||
import inspect
|
||||
|
||||
class Descriptor:
|
||||
def __get__(self, instance, owner) -> str:
|
||||
return 1
|
||||
|
||||
class C:
|
||||
normal: int = 1
|
||||
descriptor: Descriptor = Descriptor()
|
||||
```
|
||||
|
||||
If we access attributes on an instance of `C` as usual, the descriptor protocol is invoked, and we
|
||||
get a type of `str` for the `descriptor` attribute:
|
||||
|
||||
```py
|
||||
c = C()
|
||||
|
||||
reveal_type(c.normal) # revealed: int
|
||||
reveal_type(c.descriptor) # revealed: str
|
||||
```
|
||||
|
||||
However, if we use `inspect.getattr_static`, we can see the underlying `Descriptor` type:
|
||||
|
||||
```py
|
||||
reveal_type(inspect.getattr_static(c, "normal")) # revealed: int
|
||||
reveal_type(inspect.getattr_static(c, "descriptor")) # revealed: Descriptor
|
||||
```
|
||||
|
||||
For non-existent attributes, a default value can be provided:
|
||||
|
||||
```py
|
||||
reveal_type(inspect.getattr_static(C, "normal", "default-arg")) # revealed: int
|
||||
reveal_type(inspect.getattr_static(C, "non_existent", "default-arg")) # revealed: Literal["default-arg"]
|
||||
```
|
||||
|
||||
When a non-existent attribute is accessed without a default value, the runtime raises an
|
||||
`AttributeError`. We could emit a diagnostic for this case, but that is currently not supported:
|
||||
|
||||
```py
|
||||
# TODO: we could emit a diagnostic here
|
||||
reveal_type(inspect.getattr_static(C, "non_existent")) # revealed: Never
|
||||
```
|
||||
|
||||
We can access attributes on objects of all kinds:
|
||||
|
||||
```py
|
||||
import sys
|
||||
|
||||
reveal_type(inspect.getattr_static(sys, "platform")) # revealed: LiteralString
|
||||
reveal_type(inspect.getattr_static(inspect, "getattr_static")) # revealed: Literal[getattr_static]
|
||||
|
||||
reveal_type(inspect.getattr_static(1, "real")) # revealed: Literal[1]
|
||||
```
|
||||
|
||||
(Implicit) instance attributes can also be accessed through `inspect.getattr_static`:
|
||||
|
||||
```py
|
||||
class D:
|
||||
def __init__(self) -> None:
|
||||
self.instance_attr: int = 1
|
||||
|
||||
reveal_type(inspect.getattr_static(D(), "instance_attr")) # revealed: int
|
||||
```
|
||||
|
||||
## Error cases
|
||||
|
||||
We can only infer precise types if the attribute is a literal string. In all other cases, we fall
|
||||
back to `Any`:
|
||||
|
||||
```py
|
||||
import inspect
|
||||
|
||||
class C:
|
||||
x: int = 1
|
||||
|
||||
def _(attr_name: str):
|
||||
reveal_type(inspect.getattr_static(C(), attr_name)) # revealed: Any
|
||||
reveal_type(inspect.getattr_static(C(), attr_name, 1)) # revealed: Any
|
||||
```
|
||||
|
||||
But we still detect errors in the number or type of arguments:
|
||||
|
||||
```py
|
||||
# error: [missing-argument] "No arguments provided for required parameters `obj`, `attr` of function `getattr_static`"
|
||||
inspect.getattr_static()
|
||||
|
||||
# error: [missing-argument] "No argument provided for required parameter `attr`"
|
||||
inspect.getattr_static(C())
|
||||
|
||||
# error: [invalid-argument-type] "Object of type `Literal[1]` cannot be assigned to parameter 2 (`attr`) of function `getattr_static`; expected type `str`"
|
||||
inspect.getattr_static(C(), 1)
|
||||
|
||||
# error: [too-many-positional-arguments] "Too many positional arguments to function `getattr_static`: expected 3, got 4"
|
||||
inspect.getattr_static(C(), "x", "default-arg", "one too many")
|
||||
```
|
||||
|
||||
## Possibly unbound attributes
|
||||
|
||||
```py
|
||||
import inspect
|
||||
|
||||
def _(flag: bool):
|
||||
class C:
|
||||
if flag:
|
||||
x: int = 1
|
||||
|
||||
reveal_type(inspect.getattr_static(C, "x", "default")) # revealed: int | Literal["default"]
|
||||
```
|
||||
|
||||
## Gradual types
|
||||
|
||||
```py
|
||||
import inspect
|
||||
from typing import Any
|
||||
|
||||
def _(a: Any, tuple_of_any: tuple[Any]):
|
||||
reveal_type(inspect.getattr_static(a, "x", "default")) # revealed: Any | Literal["default"]
|
||||
|
||||
# TODO: Ideally, this would just be `Literal[index]`
|
||||
reveal_type(inspect.getattr_static(tuple_of_any, "index", "default")) # revealed: Literal[index] | Literal["default"]
|
||||
```
|
||||
|
||||
[official documentation]: https://docs.python.org/3/library/inspect.html#inspect.getattr_static
|
||||
258
crates/red_knot_python_semantic/resources/mdtest/call/methods.md
Normal file
258
crates/red_knot_python_semantic/resources/mdtest/call/methods.md
Normal file
@@ -0,0 +1,258 @@
|
||||
# Methods
|
||||
|
||||
## Background: Functions as descriptors
|
||||
|
||||
> Note: See also this related section in the descriptor guide: [Functions and methods].
|
||||
|
||||
Say we have a simple class `C` with a function definition `f` inside its body:
|
||||
|
||||
```py
|
||||
class C:
|
||||
def f(self, x: int) -> str:
|
||||
return "a"
|
||||
```
|
||||
|
||||
Whenever we access the `f` attribute through the class object itself (`C.f`) or through an instance
|
||||
(`C().f`), this access happens via the descriptor protocol. Functions are (non-data) descriptors
|
||||
because they implement a `__get__` method. This is crucial in making sure that method calls work as
|
||||
expected. In general, the signature of the `__get__` method in the descriptor protocol is
|
||||
`__get__(self, instance, owner)`. The `self` argument is the descriptor object itself (`f`). The
|
||||
passed value for the `instance` argument depends on whether the attribute is accessed from the class
|
||||
object (in which case it is `None`), or from an instance (in which case it is the instance of type
|
||||
`C`). The `owner` argument is the class itself (`C` of type `Literal[C]`). To summarize:
|
||||
|
||||
- `C.f` is equivalent to `getattr_static(C, "f").__get__(None, C)`
|
||||
- `C().f` is equivalent to `getattr_static(C, "f").__get__(C(), C)`
|
||||
|
||||
Here, `inspect.getattr_static` is used to bypass the descriptor protocol and directly access the
|
||||
function attribute. The way the special `__get__` method *on functions* works is as follows. In the
|
||||
former case, if the `instance` argument is `None`, `__get__` simply returns the function itself. In
|
||||
the latter case, it returns a *bound method* object:
|
||||
|
||||
```py
|
||||
from inspect import getattr_static
|
||||
|
||||
reveal_type(getattr_static(C, "f")) # revealed: Literal[f]
|
||||
|
||||
reveal_type(getattr_static(C, "f").__get__) # revealed: <method-wrapper `__get__` of `f`>
|
||||
|
||||
reveal_type(getattr_static(C, "f").__get__(None, C)) # revealed: Literal[f]
|
||||
reveal_type(getattr_static(C, "f").__get__(C(), C)) # revealed: <bound method `f` of `C`>
|
||||
```
|
||||
|
||||
In conclusion, this is why we see the following two types when accessing the `f` attribute on the
|
||||
class object `C` and on an instance `C()`:
|
||||
|
||||
```py
|
||||
reveal_type(C.f) # revealed: Literal[f]
|
||||
reveal_type(C().f) # revealed: <bound method `f` of `C`>
|
||||
```
|
||||
|
||||
A bound method is a callable object that contains a reference to the `instance` that it was called
|
||||
on (can be inspected via `__self__`), and the function object that it refers to (can be inspected
|
||||
via `__func__`):
|
||||
|
||||
```py
|
||||
bound_method = C().f
|
||||
|
||||
reveal_type(bound_method.__self__) # revealed: C
|
||||
reveal_type(bound_method.__func__) # revealed: Literal[f]
|
||||
```
|
||||
|
||||
When we call the bound method, the `instance` is implicitly passed as the first argument (`self`):
|
||||
|
||||
```py
|
||||
reveal_type(C().f(1)) # revealed: str
|
||||
reveal_type(bound_method(1)) # revealed: str
|
||||
```
|
||||
|
||||
When we call the function object itself, we need to pass the `instance` explicitly:
|
||||
|
||||
```py
|
||||
C.f(1) # error: [missing-argument]
|
||||
|
||||
reveal_type(C.f(C(), 1)) # revealed: str
|
||||
```
|
||||
|
||||
When we access methods from derived classes, they will be bound to instances of the derived class:
|
||||
|
||||
```py
|
||||
class D(C):
|
||||
pass
|
||||
|
||||
reveal_type(D().f) # revealed: <bound method `f` of `D`>
|
||||
```
|
||||
|
||||
If we access an attribute on a bound method object itself, it will defer to `types.MethodType`:
|
||||
|
||||
```py
|
||||
reveal_type(bound_method.__hash__) # revealed: <bound method `__hash__` of `MethodType`>
|
||||
```
|
||||
|
||||
If an attribute is not available on the bound method object, it will be looked up on the underlying
|
||||
function object. We model this explicitly, which means that we can access `__kwdefaults__` on bound
|
||||
methods, even though it is not available on `types.MethodType`:
|
||||
|
||||
```py
|
||||
reveal_type(bound_method.__kwdefaults__) # revealed: @Todo(generics) | None
|
||||
```
|
||||
|
||||
## Basic method calls on class objects and instances
|
||||
|
||||
```py
|
||||
class Base:
|
||||
def method_on_base(self, x: int | None) -> str:
|
||||
return "a"
|
||||
|
||||
class Derived(Base):
|
||||
def method_on_derived(self, x: bytes) -> tuple[int, str]:
|
||||
return (1, "a")
|
||||
|
||||
reveal_type(Base().method_on_base(1)) # revealed: str
|
||||
reveal_type(Base.method_on_base(Base(), 1)) # revealed: str
|
||||
|
||||
Base().method_on_base("incorrect") # error: [invalid-argument-type]
|
||||
Base().method_on_base() # error: [missing-argument]
|
||||
Base().method_on_base(1, 2) # error: [too-many-positional-arguments]
|
||||
|
||||
reveal_type(Derived().method_on_base(1)) # revealed: str
|
||||
reveal_type(Derived().method_on_derived(b"abc")) # revealed: tuple[int, str]
|
||||
reveal_type(Derived.method_on_base(Derived(), 1)) # revealed: str
|
||||
reveal_type(Derived.method_on_derived(Derived(), b"abc")) # revealed: tuple[int, str]
|
||||
```
|
||||
|
||||
## Method calls on literals
|
||||
|
||||
### Boolean literals
|
||||
|
||||
```py
|
||||
reveal_type(True.bit_length()) # revealed: int
|
||||
reveal_type(True.as_integer_ratio()) # revealed: tuple[int, Literal[1]]
|
||||
```
|
||||
|
||||
### Integer literals
|
||||
|
||||
```py
|
||||
reveal_type((42).bit_length()) # revealed: int
|
||||
```
|
||||
|
||||
### String literals
|
||||
|
||||
```py
|
||||
reveal_type("abcde".find("abc")) # revealed: int
|
||||
reveal_type("foo".encode(encoding="utf-8")) # revealed: bytes
|
||||
|
||||
"abcde".find(123) # error: [invalid-argument-type]
|
||||
```
|
||||
|
||||
### Bytes literals
|
||||
|
||||
```py
|
||||
reveal_type(b"abcde".startswith(b"abc")) # revealed: bool
|
||||
```
|
||||
|
||||
## Method calls on `LiteralString`
|
||||
|
||||
```py
|
||||
from typing_extensions import LiteralString
|
||||
|
||||
def f(s: LiteralString) -> None:
|
||||
reveal_type(s.find("a")) # revealed: int
|
||||
```
|
||||
|
||||
## Method calls on `tuple`
|
||||
|
||||
```py
|
||||
def f(t: tuple[int, str]) -> None:
|
||||
reveal_type(t.index("a")) # revealed: int
|
||||
```
|
||||
|
||||
## Method calls on unions
|
||||
|
||||
```py
|
||||
from typing import Any
|
||||
|
||||
class A:
|
||||
def f(self) -> int:
|
||||
return 1
|
||||
|
||||
class B:
|
||||
def f(self) -> str:
|
||||
return "a"
|
||||
|
||||
def f(a_or_b: A | B, any_or_a: Any | A):
|
||||
reveal_type(a_or_b.f) # revealed: <bound method `f` of `A`> | <bound method `f` of `B`>
|
||||
reveal_type(a_or_b.f()) # revealed: int | str
|
||||
|
||||
reveal_type(any_or_a.f) # revealed: Any | <bound method `f` of `A`>
|
||||
reveal_type(any_or_a.f()) # revealed: Any | int
|
||||
```
|
||||
|
||||
## Method calls on `KnownInstance` types
|
||||
|
||||
```toml
|
||||
[environment]
|
||||
python-version = "3.12"
|
||||
```
|
||||
|
||||
```py
|
||||
type IntOrStr = int | str
|
||||
|
||||
reveal_type(IntOrStr.__or__) # revealed: <bound method `__or__` of `typing.TypeAliasType`>
|
||||
```
|
||||
|
||||
## Error cases: Calling `__get__` for methods
|
||||
|
||||
The `__get__` method on `types.FunctionType` has the following overloaded signature in typeshed:
|
||||
|
||||
```py
|
||||
from types import FunctionType, MethodType
|
||||
from typing import overload
|
||||
|
||||
@overload
|
||||
def __get__(self, instance: None, owner: type, /) -> FunctionType: ...
|
||||
@overload
|
||||
def __get__(self, instance: object, owner: type | None = None, /) -> MethodType: ...
|
||||
```
|
||||
|
||||
Here, we test that this signature is enforced correctly:
|
||||
|
||||
```py
|
||||
from inspect import getattr_static
|
||||
|
||||
class C:
|
||||
def f(self, x: int) -> str:
|
||||
return "a"
|
||||
|
||||
method_wrapper = getattr_static(C, "f").__get__
|
||||
|
||||
reveal_type(method_wrapper) # revealed: <method-wrapper `__get__` of `f`>
|
||||
|
||||
# All of these are fine:
|
||||
method_wrapper(C(), C)
|
||||
method_wrapper(C())
|
||||
method_wrapper(C(), None)
|
||||
method_wrapper(None, C)
|
||||
|
||||
# Passing `None` without an `owner` argument is an
|
||||
# error: [missing-argument] "No argument provided for required parameter `owner`"
|
||||
method_wrapper(None)
|
||||
|
||||
# Passing something that is not assignable to `type` as the `owner` argument is an
|
||||
# error: [invalid-argument-type] "Object of type `Literal[1]` cannot be assigned to parameter 2 (`owner`); expected type `type`"
|
||||
method_wrapper(None, 1)
|
||||
|
||||
# Passing `None` as the `owner` argument when `instance` is `None` is an
|
||||
# error: [invalid-argument-type] "Object of type `None` cannot be assigned to parameter 2 (`owner`); expected type `type`"
|
||||
method_wrapper(None, None)
|
||||
|
||||
# Calling `__get__` without any arguments is an
|
||||
# error: [missing-argument] "No argument provided for required parameter `instance`"
|
||||
method_wrapper()
|
||||
|
||||
# Calling `__get__` with too many positional arguments is an
|
||||
# error: [too-many-positional-arguments] "Too many positional arguments: expected 2, got 3"
|
||||
method_wrapper(C(), C, "one too many")
|
||||
```
|
||||
|
||||
[functions and methods]: https://docs.python.org/3/howto/descriptor.html#functions-and-methods
|
||||
Reference in New Issue
Block a user