Commit Graph

39 Commits

Author SHA1 Message Date
Dhruv Manilawala
230c9ce236 Split Constant to individual literal nodes (#8064)
## Summary

This PR splits the `Constant` enum as individual literal nodes. It
introduces the following new nodes for each variant:
* `ExprStringLiteral`
* `ExprBytesLiteral`
* `ExprNumberLiteral`
* `ExprBooleanLiteral`
* `ExprNoneLiteral`
* `ExprEllipsisLiteral`

The main motivation behind this refactor is to introduce the new AST
node for implicit string concatenation in the coming PR. The elements of
that node will be either a string literal, bytes literal or a f-string
which can be implemented using an enum. This means that a string or
bytes literal cannot be represented by `Constant::Str` /
`Constant::Bytes` which creates an inconsistency.

This PR avoids that inconsistency by splitting the constant nodes into
it's own literal nodes, literal being the more appropriate naming
convention from a static analysis tool perspective.

This also makes working with literals in the linter and formatter much
more ergonomic like, for example, if one would want to check if this is
a string literal, it can be done easily using
`Expr::is_string_literal_expr` or matching against `Expr::StringLiteral`
as oppose to matching against the `ExprConstant` and enum `Constant`. A
few AST helper methods can be simplified as well which will be done in a
follow-up PR.

This introduces a new `Expr::is_literal_expr` method which is the same
as `Expr::is_constant_expr`. There are also intermediary changes related
to implicit string concatenation which are quiet less. This is done so
as to avoid having a huge PR which this already is.

## Test Plan

1. Verify and update all of the existing snapshots (parser, visitor)
2. Verify that the ecosystem check output remains **unchanged** for both
the linter and formatter

### Formatter ecosystem check

#### `main`

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |

#### `dhruv/constant-to-literal`

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
2023-10-30 12:13:23 +05:30
Charlie Marsh
d685107638 Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030)
This is a do-over of https://github.com/astral-sh/ruff/pull/8011, which
I accidentally merged into a non-`main` branch. Sorry!
2023-10-18 00:01:18 +00:00
konsti
644011fb14 Formatter quoting for f-strings with triple quotes (#7826)
**Summary** Quoting of f-strings can change if they are triple quoted
and only contain single quotes inside.

Fixes #6841

**Test Plan** New fixtures

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2023-10-11 11:30:34 +00:00
Dhruv Manilawala
e62e245c61 Add support for PEP 701 (#7376)
## Summary

This PR adds support for PEP 701 in Ruff. This is a rollup PR of all the
other individual PRs. The separate PRs were created for logic separation
and code reviews. Refer to each pull request for a detail description on
the change.

Refer to the PR description for the list of pull requests within this PR.

## Test Plan

### Formatter ecosystem checks

Explanation for the change in ecosystem check:
https://github.com/astral-sh/ruff/pull/7597#issue-1908878183

#### `main`

```
| project      | similarity index  | total files       | changed files     |
|--------------|------------------:|------------------:|------------------:|
| cpython      |           0.76083 |              1789 |              1631 |
| django       |           0.99983 |              2760 |                36 |
| transformers |           0.99963 |              2587 |               319 |
| twine        |           1.00000 |                33 |                 0 |
| typeshed     |           0.99983 |              3496 |                18 |
| warehouse    |           0.99967 |               648 |                15 |
| zulip        |           0.99972 |              1437 |                21 |
```

#### `dhruv/pep-701`

```
| project      | similarity index  | total files       | changed files     |
|--------------|------------------:|------------------:|------------------:|
| cpython      |           0.76051 |              1789 |              1632 |
| django       |           0.99983 |              2760 |                36 |
| transformers |           0.99963 |              2587 |               319 |
| twine        |           1.00000 |                33 |                 0 |
| typeshed     |           0.99983 |              3496 |                18 |
| warehouse    |           0.99967 |               648 |                15 |
| zulip        |           0.99972 |              1437 |                21 |
```
2023-09-29 02:55:39 +00:00
Charlie Marsh
695dbbc539 Always prefer double quotes for docstrings and triple-quoted srings (#7680)
## Summary

At present, `quote-style` is used universally. However, [PEP
8](https://peps.python.org/pep-0008/) and [PEP
257](https://peps.python.org/pep-0257/) suggest that while either single
or double quotes are acceptable in general (as long as they're
consistent), docstrings and triple-quoted strings should always use
double quotes. In our research, the vast majority of Ruff users that
enable the `flake8-quotes` rules only enable them for inline strings
(i.e., non-triple-quoted strings).

Additionally, many Black forks (like Blue and Pyink) use double quotes
for docstrings and triple-quoted strings.

Our decision for now is to always prefer double quotes for triple-quoted
strings (which should include docstrings). Based on feedback, we may
consider adding additional options (e.g., a `"preserve"` mode, to avoid
changing quotes; or a `"multiline-quote-style"` to override this).

Closes https://github.com/astral-sh/ruff/issues/7615.

## Test Plan

`cargo test`
2023-09-28 15:11:33 -04:00
konsti
c4d85d6fb6 Fix ''' ""''' formatting (#7485)
## Summary

`''' ""'''` is an edge case that was previously incorrectly formatted as
`""" """""`.

Fixes #7460

## Test Plan

Added regression test
2023-09-18 10:28:15 +00:00
Micha Reiser
2d9b39871f Introduce IndentWidth (#7301) 2023-09-13 14:52:24 +02:00
Micha Reiser
0a07a2ca62 Extract string part and normalized string (#7219) 2023-09-08 12:56:55 +02:00
Micha Reiser
41f0aad7b3 Add FString support to binary like formatting
## Summary

This is the last part of the string - binary like formatting. It adds support for handling fstrings the same as "regular" strings.


## Test Plan

I added a test for both binary and comparison. 

Small improvements across several projects

**This PR**
| project      | similarity index  | total files       | changed files     |
|--------------|------------------:|------------------:|------------------:|
| cpython      |           0.76083 |              1789 |              1632 |
| django       |           0.99966 |              2760 |                58 |
| **transformers** |           0.99929 |              2587 |               454 |
| twine        |           1.00000 |                33 |                 0 |
| typeshed     |           0.99978 |              3496 |              2173 |
| **warehouse**    |           0.99825 |               648 |                22 |
| **zulip**        |           0.99950 |              1437 |                27 |


**Base**

| project      | similarity index  | total files       | changed files     |
|--------------|------------------:|------------------:|------------------:|
| cpython      |           0.76083 |              1789 |              1632 |
| django       |           0.99966 |              2760 |                58 |
| transformers |           0.99928 |              2587 |               454 |
| twine        |           1.00000 |                33 |                 0 |
| typeshed     |           0.99978 |              3496 |              2173 |
| warehouse    |           0.99824 |               648 |                22 |
| zulip        |           0.99948 |              1437 |                28 |


<!-- How was it tested? -->
2023-09-08 11:48:57 +02:00
Micha Reiser
e376c3ff7e Split implicit concatenated strings before binary expressions (#7145) 2023-09-08 06:51:26 +00:00
Micha Reiser
5f59101811 Memoize text width (#6552) 2023-09-06 07:10:13 +00:00
Micha Reiser
c05e4628b1 Introduce Token element (#7048) 2023-09-02 10:05:47 +02:00
Charlie Marsh
b404e54f33 Remove unnecessary Comment#slice calls (#6997) 2023-08-30 00:44:11 +00:00
Charlie Marsh
aea7500c1e Allow Locator#slice to take Ranged (#6922)
## Summary

As a small quality-of-life improvement, the locator can now slice like
`locator.slice(stmt)` instead of requiring
`locator.slice(stmt.range())`.

## Test Plan

`cargo test`
2023-08-28 11:08:39 -04:00
Charlie Marsh
fc89976c24 Move Ranged into ruff_text_size (#6919)
## Summary

The motivation here is that this enables us to implement `Ranged` in
crates that don't depend on `ruff_python_ast`.

Largely a mechanical refactor with a lot of regex, Clippy help, and
manual fixups.

## Test Plan

`cargo test`
2023-08-27 14:12:51 -04:00
Micha Reiser
9d77552e18 Add tab width option (#6848) 2023-08-26 12:29:58 +02:00
David Szotten
1c66bb80b7 fix is_raw_string for multiple prefixes (#6865)
fix `is_raw_string` in the presence of other prefixes (like `rb"foo"`)

fixes #6864
2023-08-25 09:58:26 +02:00
Micha Reiser
0cea4975fc Rename Comments methods (#6649) 2023-08-18 06:37:01 +00:00
Charlie Marsh
db1c556508 Implement Ranged on more structs (#6639)
## Summary

I noticed some inconsistencies around uses of `.range.start()`, structs
that have a `TextRange` field but don't implement `Ranged`, etc.

## Test Plan

`cargo test`
2023-08-17 11:22:39 -04:00
Charlie Marsh
a7cf8f0b77 Replace dynamic implicit concatenation detection with parser flag (#6513)
## Summary

In https://github.com/astral-sh/ruff/pull/6512, we added a flag to the
AST to mark implicitly-concatenated string expressions. This PR makes
use of that flag to remove the `is_implicit_concatenation` method.

## Test Plan

`cargo test`
2023-08-14 10:27:17 -04:00
konsti
01eceaf0dc Format docstrings (#6452)
**Summary** Implement docstring formatting

**Test Plan** Matches black's `docstring.py` fixture exactly, added some
new cases for what is hard to debug with black and with what black
doesn't cover.

similarity index:

main:
zulip: 0.99702
django: 0.99784
warehouse: 0.99585
build: 0.75623
transformers: 0.99469
cpython: 0.75989
typeshed: 0.74853

this branch:

zulip: 0.99702
django: 0.99784
warehouse: 0.99585
build: 0.75623
transformers: 0.99464
cpython: 0.75517
typeshed: 0.74853

The regression in transformers is actually an improvement in a file they
don't format with black (they run `black examples tests src utils
setup.py conftest.py`, the difference is in hubconf.py). cpython doesn't
use black.

Closes #6196
2023-08-14 12:28:58 +00:00
Charlie Marsh
3f0eea6d87 Rename JoinedStr to FString in the AST (#6379)
## Summary

Per the proposal in https://github.com/astral-sh/ruff/discussions/6183,
this PR renames the `JoinedStr` node to `FString`.
2023-08-07 17:33:17 +00:00
konsti
a48d16e025 Replace Formatter<PyFormatContext<'_>> with PyFormatter (#6330)
This is a refactoring to use the type alias in more places. In the
process, I had to fix and run generate.py. There are no functional
changes.
2023-08-04 10:48:58 +02:00
David Szotten
07468f8be9 format ExprJoinedStr (#5932) 2023-08-01 08:26:30 +02:00
konsti
9063f4524d Fix formatting of trailing unescaped quotes in raw triple quoted strings (#6202)
**Summary** This prevents us from turning `r'''\""'''` into
`r"""\"""""`, which is invalid syntax.

This PR fixes CI, which is currently broken on main (in a way that still
passes on linter PRs and allows merging formatter PRs, but it's bad to
have a job be red). Once merged, i'll make the formatted ecosystem
checks a required check.

**Test Plan** Added a regression test.
2023-07-31 19:25:16 +02:00
Harutaka Kawamura
0274de1fff Preserve backslash in raw string literal (#6152) 2023-07-31 12:48:17 +00:00
Luc Khai Hai
b95fc6d162 Format bytes string (#6166)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Format bytes string

Closes #6064

## Test Plan

Added a fixture based on string's one
2023-07-31 10:46:40 +02:00
Micha Reiser
40f54375cb Pull in RustPython parser (#6099) 2023-07-27 09:29:11 +00:00
Micha Reiser
2cf00fee96 Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
Micha Reiser
fdb3c8852f Prefer breaking the implicit string concatenation over breaking before % (#5947) 2023-07-24 18:30:42 +02:00
konsti
63ed7a31e8 Add message to formatter SyntaxError (#5881)
**Summary** Add a static string error message to the formatter syntax
error so we can disambiguate where the syntax error came from

**Test Plan** No fixed tests, we don't expect this to occur, but it
helped with transformers syntax error debugging:

```
Error: Failed to format node

Caused by:
    syntax error: slice first colon token was not a colon
```
2023-07-19 17:15:26 +02:00
Micha Reiser
8187bf9f7e Cover Black's is_aritmetic_like formatting (#5738) 2023-07-14 17:54:58 +02:00
Micha Reiser
067b2a6ce6 Pass parent to NeedsParentheses (#5708) 2023-07-13 08:57:29 +02:00
Micha Reiser
f1d367655b Format target: annotation = value? expressions (#5661) 2023-07-11 16:40:28 +02:00
Micha Reiser
f9129e435a Normalize '\r' in string literals to '\n'
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR normalizes line endings inside of strings to `\n` as required by the printer.

<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

I added a new test using `\r\n` and ran the ecosystem check. There are no remaining end of line panics. 


https://gist.github.com/MichaReiser/8f36b1391ca7b48475b3a4f592d74ff4

<!-- How was it tested? -->
2023-06-30 10:13:23 +02:00
Micha Reiser
38189ed913 Fix invalid printer IR error (#5422) 2023-06-29 08:09:13 +02:00
Micha Reiser
49cabca3e7 Format implicit string continuation (#5328) 2023-06-26 12:41:47 +00:00
Micha Reiser
313711aaf9 Prefer the configured quote style
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR extends the string formatting to respect the configured quote style.

<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

Extended the string test with new cases and set it up to run twice: Once with the `quote_style: Doube`, and once with `quote_style: Single` single and double quotes. 

<!-- How was it tested? -->
2023-06-26 14:24:25 +02:00
Micha Reiser
c52aa8f065 Basic string formatting
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR implements formatting for non-f-string Strings that do not use implicit concatenation. 

Docstring formatting is out of the scope of this PR.

<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

I added a few tests for simple string literals. 

## Performance

Ouch. This is hitting performance somewhat hard. This is probably because we now iterate each string a couple of times:

1. To detect if it is an implicit string continuation
2. To detect if the string contains any new lines
3. To detect the preferred quote
4. To normalize the string

Edit: I integrated the detection of newlines into the preferred quote detection so that we only iterate the string three time.
We can probably do better by merging the implicit string continuation with the quote detection and new line detection by iterating till the end of the string part and returning the offset. We then use our simple tokenizer to skip over any comments or whitespace until we find the first non trivia token. From there we keep continue doing this in a loop until we reach the end o the string. I'll leave this improvement for later.
2023-06-23 09:46:05 +02:00