I'm looking at the asm generated from the following C code.

uint8_t anode = lednum / 4;
PORTB = (1 << anode);

I get the following with -O2 optimization:

00000040 <setout>:
  40:   86 95           lsr r24
  42:   86 95           lsr r24
  44:   21 e0           ldi r18, 0x01   ; 1
  46:   30 e0           ldi r19, 0x00   ; 0
  48:   08 2e           mov r0, r24
  4a:   01 c0           rjmp    .+2     ; 0x4e
  4c:   22 0f           add r18, r18
  4e:   0a 94           dec r0
  50:   ea f7           brpl    .-6     ; 0x4c
  52:   28 bb           out 0x18, r18   ; 24
  54:   08 95           ret

I understand that the two lsr instructions shift right by two (a divide by four), but why is the rest of the work set up in two registers? Is it because the 1 is treated as an int (16 bits on AVR), or is it something else? I would expect only one register, r18 for example, to be needed.

[edit1] Added full code

[edit2] According to http://gcc.gnu.org/wiki/avr-gcc#Register_Layout r18 and r19 can be used inside any function (excluding ISR) without being restored. Neither is used outside of the function.

[edit3] Full minimal example.

#include <avr/io.h>
#include <stdint.h>

void
setout(uint8_t lednum)
{
        uint8_t anode = lednum / 4;
        PORTB = (1 << anode);
}

void
main(void)
{
        while(1)
        {
            for (int i = 0; i < 10; ++i)
                setout(i);
        }
}

Gives me:

bin/scanner.elf:     file format elf32-avr


Disassembly of section .text:

00000000 <__vectors>:
   0:   0e c0           rjmp    .+28        ; 0x1e <__ctors_end>
   2:   15 c0           rjmp    .+42        ; 0x2e <__bad_interrupt>
   4:   14 c0           rjmp    .+40        ; 0x2e <__bad_interrupt>
   6:   13 c0           rjmp    .+38        ; 0x2e <__bad_interrupt>
   8:   12 c0           rjmp    .+36        ; 0x2e <__bad_interrupt>
   a:   11 c0           rjmp    .+34        ; 0x2e <__bad_interrupt>
   c:   10 c0           rjmp    .+32        ; 0x2e <__bad_interrupt>
   e:   0f c0           rjmp    .+30        ; 0x2e <__bad_interrupt>
  10:   0e c0           rjmp    .+28        ; 0x2e <__bad_interrupt>
  12:   0d c0           rjmp    .+26        ; 0x2e <__bad_interrupt>
  14:   0c c0           rjmp    .+24        ; 0x2e <__bad_interrupt>
  16:   0b c0           rjmp    .+22        ; 0x2e <__bad_interrupt>
  18:   0a c0           rjmp    .+20        ; 0x2e <__bad_interrupt>
  1a:   09 c0           rjmp    .+18        ; 0x2e <__bad_interrupt>
  1c:   08 c0           rjmp    .+16        ; 0x2e <__bad_interrupt>

0000001e <__ctors_end>:
  1e:   11 24           eor r1, r1
  20:   1f be           out 0x3f, r1    ; 63
  22:   cf e5           ldi r28, 0x5F   ; 95
  24:   d1 e0           ldi r29, 0x01   ; 1
  26:   de bf           out 0x3e, r29   ; 62
  28:   cd bf           out 0x3d, r28   ; 61
  2a:   0d d0           rcall   .+26        ; 0x46 <main>
  2c:   1e c0           rjmp    .+60        ; 0x6a <_exit>

0000002e <__bad_interrupt>:
  2e:   e8 cf           rjmp    .-48        ; 0x0 <__vectors>

00000030 <setout>:
  30:   86 95           lsr r24
  32:   86 95           lsr r24
  34:   21 e0           ldi r18, 0x01   ; 1
  36:   30 e0           ldi r19, 0x00   ; 0
  38:   08 2e           mov r0, r24
  3a:   01 c0           rjmp    .+2         ; 0x3e <__SP_H__>
  3c:   22 0f           add r18, r18
  3e:   0a 94           dec r0
  40:   ea f7           brpl    .-6         ; 0x3c <setout+0xc>
  42:   28 bb           out 0x18, r18   ; 24
  44:   08 95           ret

00000046 <main>:
  46:   40 e0           ldi r20, 0x00   ; 0
  48:   21 e0           ldi r18, 0x01   ; 1
  4a:   30 e0           ldi r19, 0x00   ; 0
  4c:   84 2f           mov r24, r20
  4e:   86 95           lsr r24
  50:   86 95           lsr r24
  52:   b9 01           movw    r22, r18
  54:   02 c0           rjmp    .+4         ; 0x5a <main+0x14>
  56:   66 0f           add r22, r22
  58:   77 1f           adc r23, r23
  5a:   8a 95           dec r24
  5c:   e2 f7           brpl    .-8         ; 0x56 <main+0x10>
  5e:   68 bb           out 0x18, r22   ; 24
  60:   4f 5f           subi    r20, 0xFF   ; 255
  62:   4a 30           cpi r20, 0x0A   ; 10
  64:   98 f3           brcs    .-26        ; 0x4c <main+0x6>
  66:   40 e0           ldi r20, 0x00   ; 0
  68:   f1 cf           rjmp    .-30        ; 0x4c <main+0x6>

0000006a <_exit>:
  6a:   f8 94           cli

0000006c <__stop_program>:
  6c:   ff cf           rjmp    .-2         ; 0x6c <__stop_program>

It looks like setout gets inlined into main, but the code still uses two registers instead of one.

  • It appears to be doing the 1 << anode shift using a loop that doubles the value; perhaps the AVR instruction set is missing a shift-by-count instruction? r19 doesn't seem to be used in the code you posted, so without seeing more it is hard to tell why it is involved. We also don't know whether the original value of r24 is used again, which would justify copying it to another register for the shift. Or perhaps the processor has constraints on the use of some registers. Commented May 9, 2014 at 16:12
  • Yes, that much I'm aware of. The question is why it is done in r18 and r19 and not only in r18. Commented May 9, 2014 at 16:13
  • This is all of it. After the out there's only a ret instruction. Commented May 9, 2014 at 16:16
  • That is a bit puzzling. Does r19 have any special role elsewhere in the code? If you replace it with some sort of NOP, does the program still work? Commented May 9, 2014 at 16:17
  • @ChrisStratton It's not used at all actually. According to gcc.gnu.org/wiki/avr-gcc#Register_Layout it can be modified inside any function. It's not used outside the function if I put a call to setout as the only thing in a main loop. Commented May 9, 2014 at 16:21
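
For readers following the listing, here is a rough C model of the doubling loop the comments describe. It is only a sketch read off the disassembly above, not compiler output; the function name shift_by_doubling is made up for illustration, and it assumes the count stays below 128 so that it fits in a signed 8-bit counter.

```c
#include <stdint.h>

/* Hypothetical C model of the loop in setout: the value starts at 1 and is
   doubled once per count, mirroring ldi / mov / rjmp / add / dec / brpl. */
static uint8_t shift_by_doubling(uint8_t count)
{
    uint8_t value = 1;               /* ldi r18, 0x01 */
    int8_t counter = (int8_t)count;  /* mov r0, r24 (assumes count < 128) */

    /* The rjmp enters the loop at the dec, so the counter is decremented
       before the first add; brpl loops while the result is non-negative. */
    while (--counter >= 0) {                  /* dec r0 ; brpl */
        value = (uint8_t)(value + value);     /* add r18, r18 */
    }
    return value;                    /* out 0x18, r18 */
}
```

Only the low byte (r18) is ever doubled or written out in this model, which matches the observation that r19 is loaded but never used afterwards.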

1 Answer

gcc 4.9.0 does a little worse: it still uses two registers and adds an instruction compared to whatever version you are using.

#define PORTB (*(volatile unsigned char *)(0x18+0x20))
void setout(unsigned char lednum)
{
        unsigned char  anode = lednum / 4;
        PORTB = (1 << anode);
}

avr-gcc -O2 -mmcu=avr2 -c fun.c -o fun.o
avr-objdump -D fun.o

00000000 <setout>:
   0:   28 2f           mov r18, r24
   2:   26 95           lsr r18
   4:   26 95           lsr r18
   6:   81 e0           ldi r24, 0x01   ; 1
   8:   90 e0           ldi r25, 0x00   ; 0
   a:   02 2e           mov r0, r18
   c:   00 c0           rjmp    .+0         ; 0xe <setout+0xe>
   e:   88 0f           add r24, r24
  10:   0a 94           dec r0
  12:   02 f4           brpl    .+0         ; 0x14 <setout+0x14>
  14:   88 bb           out 0x18, r24   ; 24
  16:   08 95           ret

I agree with Joachim: the constant 1 has type int, so the shift is done on something larger than 8 bits. It is a bit like the mistake people make when they write:

float a;
...
a = a + 1.0;

If you pass the value to shift in as an unsigned char parameter instead:

#define PORTB (*(volatile unsigned char *)(0x18+0x20))
void setout(unsigned char one, unsigned char lednum)
{
        unsigned char  anode = lednum / 4;
        PORTB = (one << anode);
}

I get this

00000000 <setout>:
   0:   66 95           lsr r22
   2:   66 95           lsr r22
   4:   06 2e           mov r0, r22
   6:   00 c0           rjmp    .+0         ; 0x8 <setout+0x8>
   8:   88 0f           add r24, r24
   a:   0a 94           dec r0
   c:   02 f4           brpl    .+0         ; 0xe <setout+0xe>
   e:   88 bb           out 0x18, r24   ; 24
  10:   08 95           ret

I think the compiler is forcing that constant to be a 16-bit value per the language rules, but then the optimizer figures out that it doesn't need to shift all 16 bits. The load of the 16-bit constant remains. So I think Joachim nailed it. Post an answer so we can vote for it.
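
To make the promotion visible without an AVR toolchain, here is a minimal sketch that can be built with any C11 host compiler. The function name anode_mask is hypothetical; it returns the value rather than writing it to PORTB so the result can be checked. It assumes the shift count stays below the width of int, since larger counts are undefined behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as in the question, but returning the value instead of
   writing it to PORTB so it can be checked on a host compiler. */
static uint8_t anode_mask(uint8_t lednum)
{
    uint8_t anode = lednum / 4;
    /* Both operands are promoted to int before the shift, so the shift is
       an int operation; only the conversion back to uint8_t drops the
       upper bits. Assumes anode is smaller than the width of int. */
    return (uint8_t)(1 << anode);
}

/* The shift expression has type int even when the count is a uint8_t. */
static_assert(sizeof(1 << (uint8_t)3) == sizeof(int),
              "shift result has type int");
```

Because only the low 8 bits survive the store, a compiler is free to skip the work on the upper byte, which would be consistent with the unused r19 in the listings above.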

3 Comments

I know I can write 1L to make the constant a long. Is there something similar for char?
@evading - I tried a version in which I cast the 1 to a uint8_t and it didn't seem to make a difference; the promotion seems to be a "rules of C language" thing.
@ChrisStratton I tried the same (uint8_t)(1) cast as you did and got the same result. So I was wondering if there was some way to write something like 1UC, but I've never seen that done.
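
The observation that the cast makes no difference can be checked with C11 _Generic. This is a small sketch, and the macro IS_INT and both function names are invented for illustration. As far as I know standard C has no integer-constant suffix for char-sized types, and the integer promotions are applied to the operands of << regardless of any cast on them.

```c
#include <stdint.h>

/* Hypothetical helper: evaluates to 1 if the expression has type int. */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* Casting the operand does not help: it is promoted back to int before
   the shift, so the shift expression itself still has type int. */
static int cast_operand_is_still_int(uint8_t n)
{
    return IS_INT((uint8_t)1 << n);
}

/* Only a cast applied to the result of the shift yields an 8-bit type. */
static int cast_result_is_int(uint8_t n)
{
    return IS_INT((uint8_t)((uint8_t)1 << n));
}
```

Here cast_operand_is_still_int returns 1 and cast_result_is_int returns 0, which matches what both commenters saw: the generated code does not change when only the 1 is cast.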
