Cyclic0007/ruff - ruff - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Micha Reiser	f52b1f4a4d	Add tracing support to mdtest (#14935 ) ## Summary This PR extends the mdtest configuration with a `log` setting that can be any of: * `true`: Enables tracing * `false`: Disables tracing (default) * String: An ENV_FILTER similar to `RED_KNOT_LOG` ```toml log = true ``` Closes https://github.com/astral-sh/ruff/issues/13865 ## Test Plan I changed a test and tried `log=true`, `log=false`, and `log=INFO`	2024-12-13 09:10:01 +00:00
Micha Reiser	5fc8e5d80e	[red-knot] Add infrastructure to declare lints (#14873 ) ## Summary This is the second PR out of three that adds support for enabling/disabling lint rules in Red Knot. You may want to take a look at the [first PR](https://github.com/astral-sh/ruff/pull/14869) in this stack to familiarize yourself with the used terminology. This PR adds a new syntax to define a lint: ```rust declare_lint! { /// ## What it does /// Checks for references to names that are not defined. /// /// ## Why is this bad? /// Using an undefined variable will raise a `NameError` at runtime. /// /// ## Example /// /// ```python /// print(x) # NameError: name 'x' is not defined /// ``` pub(crate) static UNRESOLVED_REFERENCE = { summary: "detects references to names that are not defined", status: LintStatus::preview("1.0.0"), default_level: Level::Warn, } } ``` A lint has a name and metadata about its status (preview, stable, removed, deprecated), the default diagnostic level (unless the configuration changes), and documentation. I use a macro here to derive the kebab-case name and extract the documentation automatically. This PR doesn't yet add any mechanism to discover all known lints. This will be added in the next and last PR in this stack. ## Documentation I documented some rules but then decided that it's probably not my best use of time if I document all of them now (it also means that I play catch-up with all of you forever). That's why I left some rules undocumented (marked with TODO) ## Where is the best place to define all lints? I'm not sure. I think what I have in this PR is fine but I also don't love it because most lints are in a single place but not all of them. If you have ideas, let me know. ## Why is the message not part of the lint, unlike Ruff's `Violation` I understand that the main motivation for defining `message` on `Violation` in Ruff is to remove the need to repeat the same message over and over again. I'm not sure if this is an actual problem. 
Most rules only emit a diagnostic in a single place and they commonly use different messages if they emit diagnostics in different code paths, requiring extra fields on the `Violation` struct. That's why I'm not convinced that there's an actual need for it and there are alternatives that can reduce the repetition when creating a diagnostic: * Create a helper function. We already do this in red knot with the `add_xy` methods * Create a custom `Diagnostic` implementation that tailors the entire diagnostic and pre-codes e.g. the message Avoiding an extra field on the `Violation` also removes the need to allocate intermediate strings, as is commonly the case in Ruff. Instead, Red Knot can use a borrowed string with `format_args` ## Test Plan `cargo test`	2024-12-10 16:14:44 +00:00
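The commit above says the macro derives a kebab-case lint name from the declared identifier. A minimal sketch of that idea follows, assuming a plain function instead of a macro. `LintMetadata`, `kebab_case_name`, and `declare` are hypothetical names for illustration and are not the types that `declare_lint!` actually generates.

```rust
/// Severity a lint reports at unless the configuration overrides it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Ignore,
    Warn,
    Error,
}

/// Hypothetical stand-in for the metadata attached to a declared lint.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LintMetadata {
    pub name: String,
    pub summary: &'static str,
    pub default_level: Level,
}

/// Converts a `SCREAMING_SNAKE_CASE` identifier into a kebab-case lint name.
pub fn kebab_case_name(ident: &str) -> String {
    ident.to_ascii_lowercase().replace('_', "-")
}

/// Builds the metadata for a lint from its identifier (assumed helper).
pub fn declare(ident: &str, summary: &'static str, default_level: Level) -> LintMetadata {
    LintMetadata {
        name: kebab_case_name(ident),
        summary,
        default_level,
    }
}
```

With this sketch, `declare("UNRESOLVED_REFERENCE", "detects references to names that are not defined", Level::Warn)` yields a lint named `unresolved-reference`.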
David Peter	74309008fd	[red-knot] Property tests (#14178 ) ## Summary This PR adds a new `property_tests` module with quickcheck-based tests that verify certain properties of types. The following properties are currently checked: * `is_equivalent_to`: * is reflexive: `T` is equivalent to itself * `is_subtype_of`: * is reflexive: `T` is a subtype of `T` * is antisymmetric: if `S <: T` and `T <: S`, then `S` is equivalent to `T` * is transitive: `S <: T` & `T <: U` => `S <: U` * `is_disjoint_from`: * is irreflexive: `T` is not disjoint from `T` * is symmetric: `S` disjoint from `T` => `T` disjoint from `S` * `is_assignable_to`: * is reflexive * `negate`: * is an involution: `T.negate().negate()` is equivalent to `T` There are also some tests that validate higher-level properties like: * `S <: T` implies that `S` is not disjoint from `T` * `S <: T` implies that `S` is assignable to `T` * A singleton type must also be single-valued These tests found a few bugs so far: - #14177 - #14195 - #14196 - #14210 - #14731 Some additional notes: - Quickcheck-based property tests are non-deterministic and finding counter-examples might take an arbitrarily long time. This makes them bad candidates for running in CI (for every PR). We can think of running them in a cron-job way from time to time, similar to fuzzing. But for now, it's only possible to run them locally (see instructions in source code). - Some tests currently find false positive "counterexamples" because our understanding of equivalence of types is not yet complete. We do not understand that `int | str` is the same as `str | int`, for example. These tests are in a separate `property_tests::flaky` module. - Properties can not be formulated in every way possible, due to the fact that `is_disjoint_from` and `is_subtype_of` can produce false negative answers. 
- The current shrinking implementation is very naive, which leads to counterexamples that are very long (`str & Any & ~tuple[Any] & ~tuple[Unknown] & ~Literal[""] & ~Literal["a"] | str & int & ~tuple[Any] & ~tuple[Unknown]`), requiring the developer to simplify manually. It has not been a major issue so far, but there is a comment in the code on how this can be improved. - The tests are currently implemented using a macro. This is a single commit on top which can easily be reverted, if we prefer the plain code instead. With the macro: ```rs // `S <: T` implies that `S` can be assigned to `T`. type_property_test!( subtype_of_implies_assignable_to, db, forall types s, t. s.is_subtype_of(db, t) => s.is_assignable_to(db, t) ); ``` without the macro: ```rs /// `S <: T` implies that `S` can be assigned to `T`. #[quickcheck] fn subtype_of_implies_assignable_to(s: Ty, t: Ty) -> bool { let db = get_cached_db(); let s = s.into_type(&db); let t = t.into_type(&db); !s.is_subtype_of(&db, t) || s.is_assignable_to(&db, t) } ``` ## Test Plan ```bash while cargo test --release -p red_knot_python_semantic --features property_tests types::property_tests; do :; done ```	2024-12-03 13:54:54 +01:00
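The properties listed in the commit above can be illustrated without quickcheck by checking them exhaustively over a tiny, made-up type universe. This is only a sketch: the `Ty` enum, its ordering, and the helper functions are assumptions for illustration and have nothing to do with red-knot's real `Type`. Exhaustive enumeration replaces random generation here.

```rust
/// Toy type universe; NOT red-knot's `Type`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    Never,
    Bool,
    Int,
    Object,
    Any,
}

pub const ALL: [Ty; 5] = [Ty::Never, Ty::Bool, Ty::Int, Ty::Object, Ty::Any];

impl Ty {
    /// Position in the toy chain `Never <: Bool <: Int <: Object`; `Any` has none.
    fn rank(self) -> Option<u8> {
        match self {
            Ty::Never => Some(0),
            Ty::Bool => Some(1),
            Ty::Int => Some(2),
            Ty::Object => Some(3),
            Ty::Any => None,
        }
    }

    pub fn is_subtype_of(self, other: Ty) -> bool {
        self == other || matches!((self.rank(), other.rank()), (Some(a), Some(b)) if a <= b)
    }

    /// In this toy model, `Any` is assignable to and from everything.
    pub fn is_assignable_to(self, other: Ty) -> bool {
        self == Ty::Any || other == Ty::Any || self.is_subtype_of(other)
    }
}

/// Checks a two-type property for every pair in the universe.
pub fn holds_for_all_pairs(property: impl Fn(Ty, Ty) -> bool) -> bool {
    ALL.iter().all(|&s| ALL.iter().all(|&t| property(s, t)))
}

/// Checks a three-type property for every triple in the universe.
pub fn holds_for_all_triples(property: impl Fn(Ty, Ty, Ty) -> bool) -> bool {
    ALL.iter()
        .all(|&s| ALL.iter().all(|&t| ALL.iter().all(|&u| property(s, t, u))))
}
```

For example, `holds_for_all_pairs(|s, t| !s.is_subtype_of(t) || s.is_assignable_to(t))` expresses "`S <: T` implies that `S` is assignable to `T`" in the same implication-as-disjunction form used by the non-macro test in the commit message. In this toy model assignability is not transitive (`Int` to `Any` to `Bool`), which is the kind of distinction such checks surface.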
Micha Reiser	81e5830585	Workspace discovery (#14308 )	2024-11-15 19:20:15 +01:00
Micha Reiser	2b58705cc1	Remove the optional salsa dependency from the AST crate (#14363 )	2024-11-15 16:46:04 +00:00
Dhruv Manilawala	9ec690b8f8	[red-knot] Add support for string annotations (#14151 ) ## Summary This PR adds support for parsing and inferring types within string annotations. ### Implementation (attempt 1) This is preserved in `6217f48924`. The implementation here would separate the inference of string annotations in the deferred query. This requires the following: * Two ways of evaluating the deferred definitions - lazily and eagerly. * An eager evaluation occurs right outside the definition query which in this case would be in `binding_ty` and `declaration_ty`. * A lazy evaluation occurs on demand like using the `definition_expression_ty` to determine the function return type and class bases. * The above point means that when trying to get the binding type for a variable in an annotated assignment, the definition query won't include the type. So, it'll require going through the deferred query to get the type. This has the following limitations: * Nested string annotations, although not necessarily a useful feature, are difficult to implement unless we convert the implementation in an infinite loop * Partial string annotations require complex layout because inferring the types for stringified and non-stringified parts of the annotation is done in separate queries. This means we need to maintain additional information ### Implementation (attempt 2) This is the final diff in this PR. The implementation here does the complete inference of string annotation in the same definition query by maintaining certain state while trying to infer different parts of an expression and making decisions accordingly. These are: * Allow names that are part of a string annotation to not exist in the symbol table. For example, in `x: "Foo"`, if the "Foo" symbol is not defined then it won't exist in the symbol table even though it's being used. This is an invariant which is being allowed only for symbols in a string annotation. 
* Similarly, lookup name is updated to do the same and if the symbol doesn't exist, then it's unbound. * Store the final type of a string annotation on the string expression itself and not for any of the sub-expressions that are created after parsing. This is because those sub-expressions won't exist in the semantic index. Design document: https://www.notion.so/astral-sh/String-Annotations-12148797e1ca801197a9f146641e5b71?pvs=4 Closes: #13796 ## Test Plan * Add various test cases in our markdown framework * Run `red_knot` on LibCST (contains a lot of string annotations, specifically https://github.com/Instagram/LibCST/blob/main/libcst/matchers/_matcher_base.py), FastAPI (good amount of annotated code including `typing.Literal`) and compare against the `main` branch output	2024-11-15 04:10:18 +00:00
Micha Reiser	bc0586d922	Avoid cloning `Name` when looking up function and class types (#14092 )	2024-11-04 15:52:59 +01:00
Alex Waygood	df45a0e3f9	[red-knot] Add MRO resolution for classes (#14027 )	2024-11-04 13:31:38 +00:00
Aditya Pratap Singh	7fd8e30eed	[red-knot] Cleanup generated names of mdtest tests (#13831 ) Co-authored-by: Alex Waygood <alex.waygood@gmail.com> Co-authored-by: Micha Reiser <micha@reiser.io>	2024-10-20 15:11:53 +00:00
David Peter	04b636cba2	[red knot] Use memmem::find instead of custom version (#13750 ) This is a follow-up on #13746: - Use `memmem::find` instead of rolling our own inferior version. - Avoid `x.as_ref()` calls using `&**x`	2024-10-14 15:17:19 +02:00
Carl Meyer	93eff7f174	[red-knot] type inference/checking test framework (#13636 ) ## Summary Adds a markdown-based test framework for writing tests of type inference and type checking. Fixes #11664. Implements the basic required features. A markdown test file is a suite of tests, each test can contain one or more Python files, with optionally specified path/name. The test writes all files to an in-memory file system, runs red-knot, and matches the resulting diagnostics against `Type: ` and `Error: ` assertions embedded in the Python source as comments. We will want to add features like incremental tests, setting custom configuration for tests, writing non-Python files, testing syntax errors, capturing full diagnostic output, etc. There's also plenty of room for improved UX (colored output?). ## Test Plan Lots of tests! Sample of the current output when a test fails: ``` Running tests/inference.rs (target/debug/deps/inference-7c96590aa84de2a4) running 1 test test inference::path_1_resources_inference_numbers_md ... FAILED failures: ---- inference::path_1_resources_inference_numbers_md stdout ---- inference/numbers.md - Numbers - Floats /src/test.py line 2: unexpected error: [invalid-assignment] "Object of type `Literal["str"]` is not assignable to `int`" thread 'inference::path_1_resources_inference_numbers_md' panicked at crates/red_knot_test/src/lib.rs:60:5: Some tests failed. note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace failures: inference::path_1_resources_inference_numbers_md test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.19s error: test failed, to rerun pass `-p red_knot_test --test inference` ``` --------- Co-authored-by: Micha Reiser <micha@reiser.io> Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-10-08 12:33:19 -07:00
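The matching step described in the commit above (diagnostics compared against `Type: ` and `Error: ` assertions embedded in comments) could look roughly like the sketch below. The exact comment syntax, the substring matching rule, and every name here (`Assertion`, `parse_assertions`, `unexpected_errors`) are assumptions for illustration, not the `red_knot_test` implementation. The `#` detection is deliberately naive and ignores `#` inside string literals.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Type,
    Error,
}

/// One assertion found in a trailing comment (hypothetical representation).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Assertion {
    /// 1-based line number within the embedded Python file.
    pub line: usize,
    pub kind: Kind,
    pub text: String,
}

/// Collects `# Type: ...` and `# Error: ...` comments from Python source.
pub fn parse_assertions(source: &str) -> Vec<Assertion> {
    let mut assertions = Vec::new();
    for (index, line) in source.lines().enumerate() {
        // Naive: treats the first `#` on the line as the start of a comment.
        let Some((_, comment)) = line.split_once('#') else {
            continue;
        };
        let comment = comment.trim();
        let (kind, text) = if let Some(rest) = comment.strip_prefix("Type: ") {
            (Kind::Type, rest)
        } else if let Some(rest) = comment.strip_prefix("Error: ") {
            (Kind::Error, rest)
        } else {
            continue;
        };
        assertions.push(Assertion {
            line: index + 1,
            kind,
            text: text.trim().to_string(),
        });
    }
    assertions
}

/// Returns the `(line, message)` diagnostics that no `Error` assertion on the
/// same line accounts for (assumed rule: the message contains the assertion text).
pub fn unexpected_errors<'a>(
    assertions: &[Assertion],
    diagnostics: &[(usize, &'a str)],
) -> Vec<(usize, &'a str)> {
    diagnostics
        .iter()
        .copied()
        .filter(|(line, message)| {
            !assertions.iter().any(|a| {
                a.kind == Kind::Error && a.line == *line && message.contains(a.text.as_str())
            })
        })
        .collect()
}
```

A failing test in this sketch would be one where `unexpected_errors` returns a non-empty list, which corresponds to the "unexpected error" line in the sample output quoted in the commit message.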
Simon	888930b7d3	[red-knot] feat: implement integer comparison (#13571 ) ## Summary Implements the comparison operator for `[Type::IntLiteral]` and `[Type::BooleanLiteral]` (as an artifact of special handling of `True` and `False` in Python). Sets the framework to implement more comparisons for types known at static time (e.g. `BooleanLiteral`, `StringLiteral`), allowing us to only implement cases of the triplet `<left> Type`, `<right> Type`, `CmpOp`. Contributes to #12701 (without checking off an item yet). ## Test Plan - Added a test for the comparison of literals that should include most cases of note. - Added a test for the comparison of int instances Please note that the cases do not cover 100% of the branches as there are many and the current testing strategy with variables makes this fairly confusing once we have too many in one test. --------- Co-authored-by: Carl Meyer <carl@astral.sh> Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-10-04 10:40:59 -07:00
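The "triplet" idea from the commit above, dispatching on the left type, the right type, and the comparison operator, can be sketched as a single `match`. The enum below is a cut-down, hypothetical `Type` with only the variants needed here, and `infer_compare` is an assumed name; the real inference code covers far more cases.

```rust
/// Cut-down stand-in for a type enum; NOT red-knot's `Type`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Type {
    IntLiteral(i64),
    BooleanLiteral(bool),
    Unknown,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CmpOp {
    Eq,
    NotEq,
    Lt,
    LtE,
    Gt,
    GtE,
}

/// Infers the result type of `left <op> right` for statically known literals.
pub fn infer_compare(left: Type, op: CmpOp, right: Type) -> Type {
    match (left, right) {
        (Type::IntLiteral(l), Type::IntLiteral(r)) => Type::BooleanLiteral(match op {
            CmpOp::Eq => l == r,
            CmpOp::NotEq => l != r,
            CmpOp::Lt => l < r,
            CmpOp::LtE => l <= r,
            CmpOp::Gt => l > r,
            CmpOp::GtE => l >= r,
        }),
        // Python's `bool` is a subclass of `int`, so `True`/`False` compare as 1/0.
        (Type::BooleanLiteral(b), other) => {
            infer_compare(Type::IntLiteral(i64::from(b)), op, other)
        }
        (other, Type::BooleanLiteral(b)) => {
            infer_compare(other, op, Type::IntLiteral(i64::from(b)))
        }
        // Anything else is not statically known in this sketch.
        _ => Type::Unknown,
    }
}
```

Reducing boolean literals to integer literals keeps the number of match arms small, at the cost of one extra recursive call per boolean operand.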
Alex Waygood	82324678cf	Rename the `ruff_vendored` crate to `red_knot_vendored` (#13586 )	2024-10-01 16:16:59 +01:00
Micha Reiser	653c09001a	Use an empty vendored file system in Ruff (#13436 ) ## Summary This PR removes the typeshed stubs from the vendored file system shipped with ruff and instead ships an empty "typeshed". Making the typeshed files optional required extracting the typeshed files into a new `ruff_vendored` crate. I do like this even if all our builds always include typeshed because it means `red_knot_python_semantic` contains less code that needs compiling. This also allows us to use deflate because the compression algorithm doesn't matter for an archive containing a single, empty file. ## Test Plan `cargo test` I verified with ` cargo tree -f "{p} {f}" -p <package> ` that: * red_knot_wasm: enables `deflate` compression * red_knot: enables `zstd` compression * `ruff`: uses stored I'm not quite sure how to build the binary that maturin builds but comparing the release artifact size with `strip = true` shows a `1.5MB` size reduction --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2024-09-21 16:31:42 +00:00
Carl Meyer	175d067250	[red-knot] add initial Type::is_equivalent_to and Type::is_assignable_to (#13332 ) These are quite incomplete, but I needed to start stubbing them out in order to build and test declared-types. Allowing unused for now, until they are used later in the declared-types PR. --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-09-12 14:15:25 -04:00
Teodoro Freund	b9c8113a8a	Added bytes type and some inference (#13061 ) ## Summary This PR adds the `bytes` type to red-knot: - Added the `bytes` type - Added support for bytes literals - Support for the `+` operator Improves on #12701 Big TODO on supporting and normalizing r-prefixed bytestrings (`rb"hello\n"`) ## Test Plan Added a test for bytes literals, concatenation, and corner values	2024-08-22 13:27:15 -07:00
Micha Reiser	dce87c21fd	Eagerly validate typeshed versions (#12786 )	2024-08-21 15:49:53 +00:00
Carl Meyer	6359e55383	[red-knot] type narrowing (#12706 ) Extend the `UseDefMap` to also track which constraints (provided by e.g. `if` tests) apply to each visible definition. Uses a custom `BitSet` and `BitSetArray` to track which constraints apply to which definitions, while keeping data inline as much as possible.	2024-08-16 16:34:13 -07:00
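To illustrate the "keep data inline" idea from the commit above, here is a minimal sketch of a fixed-capacity bit set stored in an array of 64-bit blocks, so that recording which constraints apply to a definition needs no heap allocation. `InlineBitSet` is a hypothetical simplification: the commit's actual `BitSet` and `BitSetArray` may differ, for instance in how they handle values beyond the inline capacity.

```rust
/// Fixed-capacity bit set stored inline in `B` 64-bit blocks (no heap allocation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InlineBitSet<const B: usize> {
    blocks: [u64; B],
}

impl<const B: usize> InlineBitSet<B> {
    pub fn new() -> Self {
        Self { blocks: [0; B] }
    }

    /// Sets the bit for `value`; returns `false` if it exceeds the inline capacity.
    pub fn insert(&mut self, value: u32) -> bool {
        let (block, bit) = ((value / 64) as usize, value % 64);
        match self.blocks.get_mut(block) {
            Some(slot) => {
                *slot |= 1u64 << bit;
                true
            }
            None => false,
        }
    }

    pub fn contains(&self, value: u32) -> bool {
        let (block, bit) = ((value / 64) as usize, value % 64);
        self.blocks
            .get(block)
            .is_some_and(|slot| slot & (1u64 << bit) != 0)
    }

    /// Iterates over the stored values in ascending order.
    pub fn iter(&self) -> impl Iterator<Item = u32> + '_ {
        self.blocks.iter().enumerate().flat_map(|(index, &block)| {
            (0..64u32)
                .filter(move |&bit| block & (1u64 << bit) != 0)
                .map(move |bit| index as u32 * 64 + bit)
        })
    }
}
```

A per-definition constraint table could then be a `Vec<InlineBitSet<1>>` indexed by definition id, where each set holds the ids of the constraints that apply to that definition.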
Dhruv Manilawala	99dc208b00	[red-knot] Add filename and source location for diagnostics (#12842 ) ## Summary I'm not sure if this is useful but this is a hacky implementation to add the filename and row / column numbers to the current Red Knot diagnostics.	2024-08-12 15:56:30 +00:00
Micha Reiser	a99a45868c	Eagerly validate search paths (#12783 ) Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-08-12 07:46:59 +00:00
Micha Reiser	ffaa35eafe	Add test helper to setup tracing (#12741 )	2024-08-09 07:04:04 +00:00
Carl Meyer	bc5b9b81dd	[red-knot] add dev dependency on ruff_db os feature from red_knot_pyt… (#12760 )	2024-08-08 18:10:30 +01:00
Alex Waygood	f1de08c2a0	[red-knot] Merge the semantic and module-resolver crates (#12751 )	2024-08-08 15:34:11 +01:00
Micha Reiser	eac965ecaf	[red-knot] Watch search paths (#12407 )	2024-07-24 07:38:50 +00:00
Carl Meyer	f22c8ab811	[red-knot] add maybe-undefined lint rule (#12414 ) Add a lint rule to detect if a name is definitely or possibly undefined at a given usage. If I create the file `undef/main.py` with contents: ```python x = int def foo(): z return x if flag: y = x y ``` And then run `cargo run --bin red_knot -- --current-directory ../ruff-examples/undef`, I get the output: ``` Name 'z' used when not defined. Name 'flag' used when not defined. Name 'y' used when possibly not defined. ``` If I modify the file to add `y = 0` at the top, red-knot re-checks it and I get the new output: ``` Name 'z' used when not defined. Name 'flag' used when not defined. ``` Note that `int` is not flagged, since it's a builtin, and `return x` in the function scope is not flagged, since it refers to the global `x`.	2024-07-22 13:53:59 -07:00
Carl Meyer	595b1aa4a1	[red-knot] per-definition inference, use-def maps (#12269 ) Implements definition-level type inference, with basic control flow (only if statements and if expressions so far) in Salsa. There are a couple key ideas here: 1) We can do type inference queries at any of three region granularities: an entire scope, a single definition, or a single expression. These are represented by the `InferenceRegion` enum, and the entry points are the salsa queries `infer_scope_types`, `infer_definition_types`, and `infer_expression_types`. Generally per-scope will be used for scopes that we are directly checking and per-definition will be used anytime we are looking up symbol types from another module/scope. Per-expression should be uncommon: used only for the RHS of an unpacking or multi-target assignment (to avoid re-inferring the RHS once per symbol defined in the assignment) and for test nodes in type narrowing (e.g. the `test` of an `If` node). All three queries return a `TypeInference` with a map of types for all definitions and expressions within their region. If you do e.g. scope-level inference, when it hits a definition, or an independently-inferable expression, it should use the relevant query (which may already be cached) to get all types within the smaller region. This avoids double-inferring smaller regions, even though larger regions encompass smaller ones. 2) Instead of building a control-flow graph and lazily traversing it to find definitions which reach a use of a name (which is O(n^2) in the worst case), semantic indexing builds a use-def map, where every use of a name knows which definitions can reach that use. We also no longer track all definitions of a symbol in the symbol itself; instead the use-def map also records which defs remain visible at the end of the scope, and considers these the publicly-visible definitions of the symbol (see below). 
Major items left as TODOs in this PR, to be done in follow-up PRs: 1) Free/global references aren't supported yet (only lookup based on definitions in current scope), which means the override-check example doesn't currently work. This is the first thing I'll fix as follow-up to this PR. 2) Control flow outside of if statements and expressions. 3) Type narrowing. There are also some smaller relevant changes here: 1) Eliminate `Option` in the return type of member lookups; instead always return `Type::Unbound` for a name we can't find. Also use `Type::Unbound` for modules we can't resolve (not 100% sure about this one yet.) 2) Eliminate the use of the terms "public" and "root" to refer to module-global scope or symbols. Instead consistently use the term "module-global". It's longer, but it's the clearest, and the most consistent with typical Python terminology. In particular I don't like "public" for this use because it has other implications around author intent (is an underscore-prefixed module-global symbol "public"?). And "root" is just not commonly used for this in Python. 3) Eliminate the `PublicSymbol` Salsa ingredient. Many non-module-global symbols can also be seen from other scopes (e.g. by a free var in a nested scope, or by class attribute access), and thus need to have a "public type" (that is, the type not as seen from a particular use in the control flow of the same scope, but the type as seen from some other scope.) So all symbols need to have a "public type" (here I want to keep the use of the term "public", unless someone has a better term to suggest -- since it's "public type of a symbol" and not "public symbol" the confusion with e.g. initial underscores is less of an issue.) At least initially, I would like to try not having special handling for module-global symbols vs other symbols. 4) Switch to using "definitions that reach end of scope" rather than "all definitions" in determining the public type of a symbol. 
I'm convinced that in general this is the right way to go. We may want to refine this further in future for some free-variable cases, but it can be changed purely by making changes to the building of the use-def map (the `public_definitions` index in it), without affecting any other code. One consequence of combining this with no control-flow support (just last-definition-wins) is that some inference tests now give more wrong-looking results; I left TODO comments on these tests to fix them when control flow is added. And some potential areas for consideration in the future: 1) Should `symbol_ty` be a Salsa query? This would require making all symbols a Salsa ingredient, and tracking even more dependencies. But it would save some repeated reconstruction of unions, for symbols with multiple public definitions. For now I'm not making it a query, but open to changing this in future with actual perf evidence that it's better.	2024-07-16 11:02:30 -07:00
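The use-def map described in the commit above can be sketched for a single symbol as follows: the builder tracks which definitions are currently visible, snapshots that state before a branch, and merges it back afterwards, so each use records exactly the definitions that can reach it. Everything here (`Visible`, `UseDefBuilder`, the `may_be_unbound` flag, and the snapshot/merge API) is an assumption made for illustration; the real semantic index tracks many symbols per scope and stores its data differently.

```rust
use std::collections::BTreeSet;

pub type DefId = u32;
pub type UseId = usize;

/// Definitions of one symbol that are visible at some program point.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Visible {
    pub defs: BTreeSet<DefId>,
    /// True if some path reaches this point without any definition.
    pub may_be_unbound: bool,
}

/// Hypothetical single-symbol use-def map builder.
#[derive(Debug)]
pub struct UseDefBuilder {
    current: Visible,
    next_def: DefId,
    uses: Vec<Visible>,
}

impl UseDefBuilder {
    pub fn new() -> Self {
        Self {
            current: Visible { defs: BTreeSet::new(), may_be_unbound: true },
            next_def: 0,
            uses: Vec::new(),
        }
    }

    /// A new definition shadows everything that was visible before it.
    pub fn record_definition(&mut self) -> DefId {
        let id = self.next_def;
        self.next_def += 1;
        self.current = Visible { defs: BTreeSet::from([id]), may_be_unbound: false };
        id
    }

    /// A use remembers the definitions that can reach it.
    pub fn record_use(&mut self) -> UseId {
        self.uses.push(self.current.clone());
        self.uses.len() - 1
    }

    /// Captures the state before entering a branch.
    pub fn snapshot(&self) -> Visible {
        self.current.clone()
    }

    /// Resets to an earlier state, e.g. before visiting an `else` branch.
    pub fn restore(&mut self, state: Visible) {
        self.current = state;
    }

    /// Joins the current state with the state of another control-flow path.
    pub fn merge(&mut self, other: Visible) {
        self.current.defs.extend(other.defs);
        self.current.may_be_unbound |= other.may_be_unbound;
    }

    pub fn definitions_for_use(&self, use_id: UseId) -> &Visible {
        &self.uses[use_id]
    }

    /// Definitions still visible at the end of the scope.
    pub fn public_definitions(&self) -> &Visible {
        &self.current
    }
}
```

For `if flag: y = x` followed by a use of `y`, the sequence would be `snapshot`, `record_definition`, `merge` with the snapshot, then `record_use`: the use sees one definition and `may_be_unbound == true`, which is the "possibly not defined" situation from the maybe-undefined lint commit earlier in this log.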
Carl Meyer	0e44235981	[red-knot] intern types using Salsa (#12061 ) Intern types using Salsa interning instead of in the `TypeInference` result. This eliminates the need for `TypingContext`, and also paves the way for finer-grained type inference queries.	2024-07-05 12:16:37 -07:00
Micha Reiser	3ce8b9fcae	Make `Definition` a salsa-ingredient (#12151 )	2024-07-04 06:46:08 +00:00
Micha Reiser	5109b50bb3	Use `CompactString` for `Identifier` (#12101 )	2024-07-01 10:06:02 +02:00
Alex Waygood	736a4ead14	[red-knot] Move module-resolution logic to its own crate (#11964 )	2024-06-21 13:25:44 +00:00
Micha Reiser	2dfbf118d7	[red-knot] Extract `red_knot_python_semantic` crate (#11926 )	2024-06-20 13:24:24 +02:00

31 Commits