1

I have a task to print all the prime numbers between 1 and 1000000 in class and the fastest 10 programs get extra marks. The main problem is the time it takes for the prime numbers to be printed to the console.

Basically using the Sieve of Eratosthenes I produce an array with only boolean values in it. The boolean value Numbers[i] is true if i+2 is a prime number.

for(i = 0; i <= n - 2; ++i)
   if (Numbers[i]) // True if the number is prime
        printf("%d\n", i+2);

Printf seems to be really slow as the program can generate the list of primes in about 0.035 s but then takes a further 11 seconds to print the list. Is there anyway I can speed this up, thanks.

17
  • 2
    I don't think because if you don't use printf compiler just wipe your calculation because you don't use it. Try to write in a file. Commented Nov 3, 2017 at 1:27
  • 3
    stdout is line buffered by default, so every time you hit the new line you will encounter the delay. Try doing fprintf to a file instead and see if that's faster. Commented Nov 3, 2017 at 1:43
  • 1
    also... check out this answer: stackoverflow.com/a/11558934/1212725 Commented Nov 3, 2017 at 1:47
  • 3
    @bruceg: stdout is line-buffered if it's going to an interactive device. Redirecting the program's output to a file is likely to be faster. Output to the console is probably going to be limited by the speed of the console. Commented Nov 3, 2017 at 2:07
  • 2
    (a) Show all of your code. (b) Try printing only the last prime found. This will prevent the compiler from optimizing away your code, as it does when there is no printf in the program. I predict you will find the program is still slow even when it has only this one printf in it, and that proves it is the prime-computation that is taking the time, not the printf. Commented Nov 3, 2017 at 2:31

2 Answers 2

2

Since by default console output is line buffered, which is the reason of the increased time. You can use the setvbuf function to allow printing to console/stdout only in chunks rather than for each iteration.

E.g.

char buffer[256];
setvbuf(stdout, buffer, _IOFBF, sizeof(buffer));

You can alter the size of buffer according to your needs.

IOFBF option is for full buffering i.e. output will be printed once the buffer is full.

See setvbuf for more details

Sign up to request clarification or add additional context in comments.

5 Comments

I predict the printf will not be the cause of the execution time, in which case this answer will be voted down. Likely it will be discovered OP’s program is inefficient.
Going by the numbers OP mentioned, the program can generate the list of primes in about 0.035 s but then takes a further 11 seconds to print the list. Also IMHO, in case of long outputs, time is consumed in scrolling(by the system)
The reason the program runs quickly without printf but slowly with it has been explained: The compiler optimizes away the program.
@EricPostpischil Thanks. I missed that.
@EricPostpischil Yes but the title of the question is "How to speed up printf in C" which in itself is an important and crystal-clear question, so lots of people come here for an answer to that question.
1

Beneath is a slightly unoptimized implementation (although I skipped the intermediate list and print directly) of what I think you were supposed to do. Running that program on an AMD A8-6600K with a small load (mainly a Youtube music-video for some personal entertainment) results in

real    0m1.211s
user    0m0.047s
sys     0m0.122s

averaged over a couple of runs. So the problem lies in your implementation of the sieve or you are hiding some essential facts about your hardware.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
#include <limits.h>
#include <string.h>

/* I call it a general bitset. Others might call it an abomination. YMMV. */
#   define ERAT_BITS (sizeof(uint32_t)*CHAR_BIT)
#   define GET_BIT(s,n)  ((*(s+(n/ERAT_BITS)) &   ( 1<<( n % ERAT_BITS ))) != 0)
#   define SET_BIT(s,n)   (*(s+(n/ERAT_BITS)) |=  ( 1<<( n % ERAT_BITS )))
#   define CLEAR_BIT(s,n) (*(s+(n/ERAT_BITS)) &= ~( 1<<( n % ERAT_BITS )))
#   define TOG_BIT(s,n)   (*(s+(n/ERAT_BITS)) ^=  ( 1<<( n % ERAT_BITS )))
/* size is the size in bits, the overall size might be bigger */
typedef struct mp_bitset_t {
    uint32_t size;
    uint32_t *content;
} mp_bitset_t;
#   define mp_bitset_alloc(bst, n) \
  do {\
      (bst)->content=malloc(( n /(sizeof(uint32_t)) + 1 ));\
      if ((bst)->content == NULL) {\
          fprintf(stderr, "memory allocation for bitset failed");\
          exit(EXIT_FAILURE);\
        }\
      (bst)->size = n;\
  } while (0)
#   define mp_bitset_size(bst)  ((bst)->size)
#   define mp_bitset_setall(bst) memset((bst)->content,~(uint32_t)(0),\
   (bst->size /(sizeof(uint32_t) ) +1 ))
#   define mp_bitset_clearall(bst) memset((bst)->content,0,\
   (bst->size /(sizeof(uint32_t) ) +1 ))
#   define mp_bitset_clear(bst,n) CLEAR_BIT((bst)->content, n)
#   define mp_bitset_set(bst,n)     SET_BIT((bst)->content, n)
#   define mp_bitset_get(bst,n)     GET_BIT((bst)->content, n)
#   define mp_bitset_free(bst) \
  do {\
     free((bst)->content);\
     free(bst);\
  } while (0)

uint32_t mp_bitset_nextset(mp_bitset_t * bst, uint32_t n);
uint32_t mp_bitset_prevset(mp_bitset_t * bst, uint32_t n);
void mp_eratosthenes(mp_bitset_t * bst);


/* It's called Hallek's method but it has many inventors*/
static uint32_t isqrt(uint32_t n)
{
   uint32_t s, rem, root;
   if (n < 1)
      return 0;
   /* This is actually the highest square but it goes
    * downward from this, quite fast */
   s = 1 << 30;
   rem = n;
   root = 0;
   while (s > 0) {
      if (rem >= (s | root)) {
         rem -= (s | root);
         root >>= 1;
         root |= s;
      } else {
         root >>= 1;
      }
      s >>= 2;
   }
   return root;
}

uint32_t mp_bitset_nextset(mp_bitset_t *bst, uint32_t n)
{
   while ((n < mp_bitset_size(bst)) && (!mp_bitset_get(bst, n))) {
      n++;
   }
   return n;
}

/*
 * Standard method, quite antique now, but good enough for the handful
 * of primes needed here.
 */
void mp_eratosthenes(mp_bitset_t *bst)
{
   uint32_t n, k, r, j;

   mp_bitset_setall(bst);
   mp_bitset_clear(bst, 0);
   mp_bitset_clear(bst, 1);

   n = mp_bitset_size(bst);
   r = isqrt(n);
   for (k = 4; k < n; k += 2)
      mp_bitset_clear(bst, k);
   k = 0;
   while ((k = mp_bitset_nextset(bst, k + 1)) < n) {
      if (k > r) {
         break;
      }
      for (j = k * k; j < n; j += k * 2) {
         mp_bitset_clear(bst, j);
      }
   }
}

#define UPPER_LIMIT 1000000 /* one million */

int main(void) {
  mp_bitset_t *bst;
  uint32_t n, k, j;

  bst = malloc(sizeof(mp_bitset_t));
  if(bst == NULL) {
    fprintf(stderr, "failed to allocate %zu bytes\n",sizeof(mp_bitset_t));
    exit(EXIT_FAILURE);
  }
  mp_bitset_alloc(bst, UPPER_LIMIT);

  mp_bitset_setall(bst);
  mp_bitset_clear(bst, 0);      // 0 is not prime b.d.
  mp_bitset_clear(bst, 1);      // 1 is not prime b.d.

  n = mp_bitset_size(bst);
  for (k = 4; k < n; k += 2) {
    mp_bitset_clear(bst, k);
  }
  k = 0;

  while ((k = mp_bitset_nextset(bst, k + 1)) < n) {
    printf("%" PRIu32 "\n", k);
    for (j = k * k; j < n; j += k * 2) {
      mp_bitset_clear(bst, j);
    }
  }
  mp_bitset_free(bst);

  return EXIT_SUCCESS;
}

Compiled with

gcc-4.9 -O3 -g3 -W -Wall -Wextra -Wuninitialized -Wstrict-aliasing -pedantic  -std=c11 tests.c -o tests

(GCC is gcc-4.9.real (Ubuntu 4.9.4-2ubuntu1~14.04.1) 4.9.4)

5 Comments

What's with the do{...}while(0) blocks? Looks like code noise to me.
You really shouldn't be doing the OP's homework assignment for them.
@jwdonahue: The do { ... } while(0) idiom is how you write a macro that can be used in any context where a statement can appear. See the comp.lang.c FAQ, question 10.4.
@jwdonahue I'm pretty sure that code is not suitable as a solution for the homework of a beginner and, yes, it was done with that goal in mind. It was meant as a proof that it's not printfbut OP's implementation of the sieve. There are already 2 close votes as I write this. I think I'll give OP a couple of hours to come up with their code (Earth is a sphere) and add my close vote otherwise if still necessary.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.