Why does integer division round down in many scripting languages?

Question

In the languages I have tested, - (x div y ) is not equal to -x div y; I have tested // in Python, / in Ruby, div in Perl 6; C has a similar behavior.

That behavior is usually according to spec, since div is usually defined as the rounding down of the result of the division, however it does not make a lot of sense from the arithmetic point of view, since it makes div behave in a different way depending on the sign, and it causes confusion such as this post on how it is done in Python.

Is there some specific rationale behind this design decision, or is just div defined that way from scratch? Apparently Guido van Rossum uses a coherency argument in a blog post that explains how it is done in Python, but you can have coherency also if you choose to round up.

(Inspired by this question by PMurias in the #perl6 IRC channel)

FWIW, in Python // is called floor division. Try this: .7 // .1. Note that it doesn't evaluate to an int. — PM 2Ring
– PM 2Ring, Commented May 20, 2018 at 9:44
FWIW, in Python you can use double negation to get ceiling division: eg -(-23 // 10) — PM 2Ring
– PM 2Ring, Commented May 20, 2018 at 10:34

chi · Accepted Answer · 2018-05-21 09:42:25Z

46

Ideally, we would like to have two operations div and mod, satisfying, for each b>0:

(a div b) * b + (a mod b) = a
0 <= (a mod b) < b
(-a) div b = -(a div b)

This is, however, a mathematical impossibility. If all the above were true, we would have

1 div 2 = 0
1 mod 2 = 1

since this is the unique integer solution to (1) and (2). Hence, we would also have, by (3),

0 = -0 = -(1 div 2) = (-1) div 2

which, by (1), implies

-1 = ((-1) div 2) * 2 + ((-1) mod 2) = 0 * 2 + ((-1) mod 2) = (-1) mod 2

making (-1) mod 2 < 0 which contradicts (2).

Hence, we need to give up some property among (1), (2), and (3).

Some programming languages give up (3), and make div round down (Python, Ruby).

In some (rare) cases the language offers multiple division operators. For instance, in Haskell we have div,mod satisfying only (1) and (2), similarly to Python, and we also have quot,rem satisfying only (1) and (3). The latter pair of operators rounds division towards zero, at the price of returning negative remainders, e.g., we have (-1) `quot` 2 = 0 and (-1) `rem` 2 = (-1).

C# also gives up (2), and allows % to return a negative remainder. Coherently, integer division rounds towards zero. Java, Scala, Pascal, and C, starting from C99, also adopt this strategy.

edited May 21, 2018 at 9:42

answered May 20, 2018 at 11:47

chi

117k4 gold badges143 silver badges230 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Eric Lippert Over a year ago

C# gives up (2). The % operator in C# is not the "mod" operator. It is the "remainder" operator, and a remainder can be negative.

Heinzi Over a year ago

@EricLippert: Just out of curiosity: If that's the case, why is the operator's id string op_Modulus instead of op_Remainder?

Eric Lippert Over a year ago

@Heinzi: No good reason that I have ever been able to determine. It is just one more on a very long list of small mistakes that don't really matter.

Chromatix · Accepted Answer · 2018-05-20 15:39:18Z

41

Floating-point operations are defined by IEEE754 with numeric applications in mind and, by default, round to the nearest representable value in a very strictly-defined manner.

Integer operations in computers are not defined by general international standards. The operations granted by languages (especially those of the C family) tend to follow whatever the underlying computer provides. Some languages define certain operations more robustly than others, but to avoid excessively difficult or slow implementations on the available (and popular) computers of their time, will choose a definition that follows its behaviour quite closely.

For this reason, integer operations tend to wrap around on overflow (for addition, multiplication, and shifting-left), and round towards negative infinity when producing an inexact result (for division, and shifting-right). Both of these are simple truncation at their respective end of the integer in two's-complement binary arithmetic; the simplest way to handle a corner-case.

Other answers discuss the relationship with the remainder or modulus operator that a language might provide alongside division. Unfortunately they have it backwards. Remainder depends on the definition of division, not the other way around, while modulus can be defined independently of division - if both arguments happen to be positive and division rounds down, they work out to be the same, so people rarely notice.

Most modern languages provide either a remainder operator or a modulus operator, rarely both. A library function may provide the other operation for people who care about the difference, which is that remainder retains the sign of the dividend, while modulus retains the sign of the divisor.

answered May 20, 2018 at 15:39

Chromatix

1,1518 silver badges12 bronze badges

12 Comments

iGian Over a year ago

This is an interesting point, also because you refer to modular arithmetics. Do you think the same answer could apply if the question would have been: "Why does integer division rounds up in many scripting languages?"

Chromatix Over a year ago

@iGian Well the fact is that most languages don't round up for integer division, for the reason I outlined in my answer. Usually they either call the CPU's DIV instruction and just accept what it does, or they call a subroutine which implements Euclidean division and normally does the same thing. A reasonable question might be "why does language X round up (or to nearest, or towards zero) when most other languages don't?" - but you'd have to specify X, and the answer would be specific to that language.

iGian Over a year ago

So, many scripting language round down integer division because the CPU DIV is basically an implementation Euclidean division? ---- I'm getting curious about why it should need to round up. Can you suggest a language X which does it?

Chromatix Over a year ago

@iGian That's just it, I don't know of one offhand - you were the one to bring it up. There are several which silently coerce integer operands to floating-point and perform a floating-point division unless you explicitly specify otherwise; eg. BBC BASIC defines / as FP div, DIV as integer with round-down. The 6502 didn't even have a DIV instruction, so this is with a subroutine.

Chromatix Over a year ago

@PeterCordes: Actually "truncation" only describes rounding towards zero in sign-magnitude representations, such as IEEE-754 floating point. In two's complement representation, truncation is effectively rounding towards negative infinity, and that is what you get from the simple expedient of shifting in copies of the sign bit at the most-significant end. In particular, if 0xFFFF is -1, then 0xFFFE is -2.

|

Paula Thomas · Accepted Answer · 2018-05-20 09:41:50Z

12

Because the implication of integer division is that the full answer includes a remainder.

answered May 20, 2018 at 9:41

Paula Thomas

1,1901 gold badge10 silver badges15 bronze badges

4 Comments

jjmerelo Over a year ago

... and the remainder must be always positive by definition? Apparently, it's not the case in every language: in C99 the remainder has the same sign as the dividend

Paula Thomas Over a year ago

Yes, that is correct and the lower limit o length of comments is driving me mad!

PM 2Ring Over a year ago

@jjmerelo In Python, the remainder (either with % or divmod()) always has the same sign as the divisor. Eg 123 % -10 -> -7

jjmerelo Over a year ago

@SeanFrancisN.Ballais it's just a matter of convention, apparently. The wikipedia entry on modulo operation says: When either a or n is negative, the naive definition breaks down and programming languages differ in how these values are defined.

abarnert · Accepted Answer · 2018-05-20 18:27:04Z

Wikipedia has a great article on this, including history as well as theory.

As long as a language satisfies the Euclidean division property that (a/b) * b + (a%b) == a, both flooring division and truncating division are coherent and arithmetically sensible.

Of course people like to argue that one is obviously correct and the other is obviously wrong, but it has more the character of a holy war than a sensible discussion, and it usually has more to do with the choice of their early preferred language than anything else. They also often tend to argue primarily for their chosen %, even though it probably makes more sense to choose / first and then just pick the % that matches.

Flooring (like Python):
- No less an authority than Donald Knuth suggests it.
- % following the sign of the divisor is apparently what about 70% of all students guess
- The operator is usually read as mod or modulo rather than remainder.
- "C does it"—which isn't even true.¹
Truncating (like C++):
- Makes integer division more consistent with IEEE float division (in default rounding mode).
- More CPUs implement it. (May not be true at different times in history.)
- The operator is read modulo rather than remainder (even though this actually argues against their point).
- The division property conceptually is more about remainder than modulus.
- The operator is read mod rather than modulo, so it should follow Fortran's distinction. (This may sound silly, but may have been the clincher for C99. See this thread.)
"Euclidean" (like Pascal—/ floors or truncates depending on signs, so % is never negative):
- Niklaus Wirth argued that nobody is ever surprised by positive mod.
- Raymond T. Boute later argued that you can't implement Euclidean division naively with either of the other rules.

A number of languages provide both. Typically—as in Ada, Modula-2, some Lisps, Haskell, and Julia—they use names related to mod for the Python-style operator and rem for the C++-style operator. But not always—Fortran, for example, calls the same things modulo and mod (as mentioned above for C99).

We don't know why Python, Tcl, Perl, and the other influential scripting languages mostly chose flooring. As noted in the question, Guido van Rossum's answer only explains why he had to choose one of the three consistent answers, not why he picked the one he did.

However, I suspect the influence of C was key. Most scripting languages are (at least initially) implemented in C, and borrow their operator inventory from C. C89's implementation-defined % is obviously broken, and not suitable for a "friendly" language like Tcl or Python. And C calls the operator "mod". So they go with modulus, not remainder.

_{1. Despite what the question says—and many people using it as an argument—C actually doesn't have similar behavior to Python and friends. C99 requires truncating division, not flooring. C89 allowed either, and also allowed either version of mod, so there's no guarantee of the division property, and no way to write portable code doing signed integer division. That's just broken.}

some Lisps: Common Lisp defines mod and rem, see lispworks.com/documentation/HyperSpec/Body/f_mod_r.htm and lispworks.com/documentation/HyperSpec/Body/f_floorc.htm
@coredump And some other Lisps use other names. MacLisp only provided one, and Racket does the same today. Scheme provides three (modulo, remainder, and Wirth-style mod). I don't think the answer needs to link to every dialect of every language in the world; saying that "some Lisps" use mod/rem-style naming seems sufficient.

iGian · Accepted Answer · 2018-05-20 16:57:03Z

7

As Paula said, it is because of the remainder.

The algorithm is founded on Euclidean division.

In Ruby, you can write this rebuilding the dividend with consistency:

puts (10/3)*3 + 10%3
#=> 10

It works the same in real life. 10 apples and 3 people. Ok you can cut one apple in three, but going outside the set integers.

With negative numbers the consistency is also kept:

puts (-10/3)*3 + -10%3 #=> -10
puts (10/(-3))*(-3) + 10%(-3) #=> 10
puts (-10/(-3))*(-3) + -10%(-3) #=> -10

The quotient is always round down (down along the negative axis) and the reminder follows:

puts (-10/3) #=> -4
puts -10%3 #=> 2

puts (10/(-3)) #=> -4
puts 10%(-3) # => -2

puts (-10/(-3)) #=> 3
puts -10%(-3) #=> -1

edited May 20, 2018 at 16:57

answered May 20, 2018 at 10:07

iGian

11.2k3 gold badges24 silver badges38 bronze badges

3 Comments

jjmerelo Over a year ago

So the division and modulo operation must be coherent, that's quite clear. But you can choose rounding up or down and still keep it coherent as long as those equalities hold.

iGian Over a year ago

@jjmerelo, I agree with you, mathematically you can choose a different rule, where # 10 / 3 = 4 # 10 % 3 = -2 # 3 * 4 - 2 = 10. But this can not work in set of natural numbers, where -2 does not exist.

Alexey Romanov Over a year ago

@iGian What your answer is missing is that there is more than one way to keep the consistency for negative numbers without changing results for positive numbers.

codesections · Accepted Answer · 2022-02-10 16:05:15Z

This answer addresses a sub-part of the question that the other (excellent) answers didn't explicitly address. You noted:

you can have coherency also if you choose to round up.

Other answers addressed the choice between rounding down (towards -∞) and truncating (rounding towards 0) but didn't compare rounding up (towards ∞).

^{(The accepted answer touches on performance reasons to prefer rounding down on a two's-complement machine, which would also apply in comparison to rounding up. But there are more important semantic reasons to avoid rounding up.)}

This answer directly addresses why rounding up is not a great solution.

Rounding up breaks elementary-school expectations

Building on an example from a previous answer's, it's common to informally say something like this:

If I evenly divide fourteen marbles among three people, each person gets four marbles and there are two marbles left over.

Indeed, this is how many students are first taught division (before being introduced to fractions/decimals). A student might write 14 ÷ 3 = 4 remainder 2. Since this is introduced so early, we'd really like our div operator to preserve this property.

Or, put a bit more formally, of the three properties discussed in the top-voted answer, the first one ((a div b) × b + (a mod b) = a) is by far the most important.

But rounding up breaks this property. If div rounds up, then 14 div 3 returns 5. This means that the equation above simplifies to 15 + (13 mod 4) = 13 – and that's not true for any definition of mod. Similarly, the less-formal/elementary-school approach is also out of luck – or at least requires introducing negative marbles: "Each person gets 5 marbles and there are negative one marbles left over".

^{(Rounding to the nearest integer also breaks the property when, as in the example above, that means rounding up.)}

Thus, if we want to maintain elementary expectations, we cannot round up. And with rounding up off the table, the coherency argument that you linked in the question is sufficient to justify rounding down.

Collectives™ on Stack Overflow

Why does integer division round down in many scripting languages?

6 Answers 6

3 Comments

12 Comments

4 Comments

2 Comments

3 Comments

Rounding up breaks elementary-school expectations

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

6 Answers 6

3 Comments

12 Comments

4 Comments

2 Comments

3 Comments

Rounding up breaks elementary-school expectations

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related