
I understand that the slicing syntax contains 3 arguments, namely:

  1. start

  2. stop

  3. step

With the following default values:

  1. start = 0

  2. stop = length of string

  3. step = 1

So, for

> string = "abc"

> string[::]

will return

> "abc"

But, for

> string[::-1]

shouldn't it return:

> "a"

Since string[start = 0] = 'a', the next element would be string[start + step], i.e. string[0 - 1] = 'c'; but because that index is not less than stop = 3, the loop would break.

Or am I thinking in the wrong direction, and Python simply slices the string in the usual direction and returns the reverse of that string if the step is negative? To simplify: how does a negative step work internally?

2 Answers


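When a negative step value is provided, Python swaps the default start and stop values; explicitly given values are not swapped. When start and stop are omitted, they default to the two ends of the sequence: with a positive step, start is the beginning and stop is the end, and with a negative step, start is the end and stop is the beginning.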

Given s[i:j:k], the following quote from the Common Sequence Operations section of the Built-in Types documentation applies:

If i or j are omitted or None, they become “end” values (which end depends on the sign of k).
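
One way to see which "end" values get picked, without reading any C, is the built-in slice.indices() method, which returns the concrete (start, stop, step) a slice resolves to for a given length. A minimal sketch (the variable names are just for illustration):

```python
s = "abc"

# slice.indices(length) returns the normalized (start, stop, step)
# for a sequence of that length.
forward = slice(None, None, 1).indices(len(s))
backward = slice(None, None, -1).indices(len(s))

print(forward)   # (0, 3, 1)
print(backward)  # (2, -1, -1)

# The normalized triple can be passed to range() to list the visited indices.
print([s[i] for i in range(*backward)])  # ['c', 'b', 'a']
```

Note that the -1 in the normalized result is a range()-style stop ("one before index 0"), not a negative index counted from the end.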


As for how it works under the hood: in CPython there are two functions for handling list subscripts, list_subscript() (for reading) and list_ass_subscript() (for assigning).

In both of those functions, after verifying that the subscript specifies a slice, calls are made to PySlice_Unpack() and PySlice_AdjustIndices() to extract and normalize the start and stop values.

Start value handling

From PySlice_Unpack():

if (r->start == Py_None) {
    *start = *step < 0 ? PY_SSIZE_T_MAX : 0;

If the start value is None and the step value is negative, the start value is set to the largest possible value.

Then, in PySlice_AdjustIndices():

else if (*start >= length) {
    *start = (step < 0) ? length - 1 : length;

If the start value is greater than or equal to the length of the list (which it undoubtedly is, due to the assignment above) and the step value is negative, then the start value is set to length - 1, where length is the length of the sequence.

Stop value handling

From PySlice_Unpack():

if (r->stop == Py_None) {
    *stop = *step < 0 ? PY_SSIZE_T_MIN : PY_SSIZE_T_MAX;

If the stop value is None and the step value is negative, the stop value is set to the smallest possible value.

Then, in PySlice_AdjustIndices():

if (*stop < 0) {
    *stop = (step < 0) ? -1 : 0;

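If the stop value is negative (which it is, due to the assignment above), the length is first added to it; the excerpt above omits that step. If it is still negative afterwards (it is, since it started at PY_SSIZE_T_MIN) and the step value is negative, the stop value is set to -1. This -1 is an internal "one before index 0" marker, not the same as writing -1 in the slice yourself, which would be interpreted as len(string) - 1.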

So with an input of string[::-1], you end up with:

  • Start value: len(string) - 1
  • Stop value: -1
  • Step: -1
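
The two normalization steps can be sketched in Python to make the flow easier to follow. This is only a rough mirror of the C logic described above, not CPython's actual code: unpack() and adjust() are hypothetical names, and sys.maxsize stands in for PY_SSIZE_T_MAX.

```python
import sys

def unpack(start, stop, step):
    """Rough mirror of PySlice_Unpack(): fill in omitted values."""
    step = 1 if step is None else step
    if step == 0:
        raise ValueError("slice step cannot be zero")
    if start is None:
        start = sys.maxsize if step < 0 else 0
    if stop is None:
        stop = -sys.maxsize - 1 if step < 0 else sys.maxsize
    return start, stop, step

def adjust(length, start, stop, step):
    """Rough mirror of PySlice_AdjustIndices(): clamp to the sequence."""
    def clamp(value):
        if value < 0:
            value += length          # negative values count from the end
            if value < 0:            # still out of range: clamp
                value = -1 if step < 0 else 0
        elif value >= length:
            value = length - 1 if step < 0 else length
        return value
    return clamp(start), clamp(stop), step

s = "abc"
start, stop, step = adjust(len(s), *unpack(None, None, -1))
print(start, stop, step)                                # 2 -1 -1
print("".join(s[i] for i in range(start, stop, step)))  # cba
```

The result should agree with the built-in slice(None, None, -1).indices(len(s)), which is a handy cross-check.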

3 Comments

The C code you linked to is a bit misleading. That's the code for del some_list[x:y:-z], not for getting a slice. When deleting, it doesn't matter what order you iterate in, so the code flips the arguments around. It doesn't do that in the code that reads (or modifies in-place) slices. Check out a few lines up for the reading code or a few further down for the modifying code.
Good call. I got thrown off course while jumping between code locations. Given that the reading/modifying loops aren't concerned with the sign of the step value, it all boils down to the handling of the start and stop values. The slice handling portions of both list_subscript() and list_ass_subscript() start with calls to PySlice_Unpack() and PySlice_AdjustIndices(), which is where the logic for start/stop value handling lives.
Took another pass at summarizing the relevant parts of the CPython implementation.
1

When a negative value is passed as the step, the default values for start and stop reverse. start changes from "beginning of sequence" to "end of sequence", and stop changes from "end of sequence" to "beginning of sequence". If it didn't do this, you'd have issues performing a complete slice, since the end of any slice is exclusive: mystr[len(mystr)-1:0:-1] would exclude the first character (because index 0 wouldn't be included), and you can't pass -1 instead to go one past 0, because that would just mean the same thing as len(mystr)-1 (thanks to how Python handles negative indices), and you'd slice out nothing at all. Making the defaults switch, so that omitting the end (or equivalently, explicitly passing None) runs all the way to the beginning, inclusive, is the only sensible solution.
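
The cases described above are easy to check directly. A small sketch, with `last` as an illustrative name for the final index:

```python
mystr = "abc"
last = len(mystr) - 1

# stop=0 is exclusive, so the first character is lost
print(repr(mystr[last:0:-1]))     # 'cb'

# stop=-1 is interpreted as index 2, the same as start: nothing is sliced
print(repr(mystr[last:-1:-1]))    # ''

# stop=None (or omitted) runs all the way past index 0
print(repr(mystr[last:None:-1]))  # 'cba'
print(repr(mystr[::-1]))          # 'cba'
```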

3 Comments

But for s = "abc" you are saying s[::-1] is equivalent to s[len(s) : 0 : -1]. So shouldn't that raise an error, since there's nothing at s[len(s)] (considering that the start is inclusive)?
@malibu: Nope. The start corresponds to the end of the sequence (which is len(seq) - 1), not the length. And the end has no fixed numeric value; apart from a blank or None, it can only be written as a length-dependent value such as -len(seq) - 1 (it's like -1 if -1 didn't mean len(seq) - 1 in Python's indexing/slicing semantics), because the end is exclusive (so 0 would exclude the first character of the original string). So the end with a negative step can be thought of as "just past 0", with no length-independent way to describe it numerically.
@malibu: Forward slicing has a similar problem, namely, expressing the end of the slice as a negative value. If you want to stop i characters shy of the end, you can use seq[:-i] unless i is 0 (which would cause you to get an empty sequence, since you'd slice nothing out). So you either do the more expensive work all the time to slice to len(seq) - i, or use a clunky seq[:-i] if i else seq[:], or you get tricky.
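
For illustration, one possible "tricky" variant relies on -0 being falsy, so that `-i or None` falls back to None when i is 0. This is only a sketch; drop_last() is a hypothetical helper, and it assumes i >= 0:

```python
def drop_last(seq, i):
    """Return seq without its last i items (assumes i >= 0)."""
    # When i is 0, -i is 0 (falsy), so the stop becomes None ("to the end")
    # instead of 0, which would produce an empty slice.
    return seq[:-i or None]

seq = "abcdef"
print(drop_last(seq, 2))  # abcd
print(drop_last(seq, 0))  # abcdef
print(repr(seq[:-0]))     # '' (the pitfall: -0 is just 0)
```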
