My goal is to parallelize a section in my Fortran program. The flow of the program is:

  1. Read data from a file
  2. Perform some computations
  3. Write the results to two different files

Here I want to parallelize the writing process since I’m writing into different files.

    module foo
        use omp_lib
        implicit none
        type element
            integer, dimension(:), allocatable          :: v1, v2
            real(kind=8), dimension(:,:), allocatable   :: M
        end type element

    contains

    subroutine test()
        implicit none
        type(element)      :: e


        do
            e = read_data_from_file()

            call compute_data(e)

            !$OMP SECTIONS
            !$OMP SECTION
            !$ call write_to_file1(e)
            !$OMP SECTION
            !$ call write_to_file2(e)
            !$OMP END SECTIONS
        end do
    end subroutine test


    ...

    end module foo

But this program isn't running any faster, so I think I'm missing something.

  • Don't waste your time parallelising I/O unless you have the hardware to support it. (In my experience people who have such hardware don't ask this kind of question, so I'm assuming that you don't.) If you have two threads trying to use one write head at (sort of) the same time, you are just going to slow both write operations down while the OS plays nice and gives each an equal share, and you pay the overhead for all those switches from one to the other. Commented Oct 24, 2016 at 16:15
  • It's also not clear from what you've posted whether you have any parallelism in your code at all -- see stackoverflow.com/questions/2770911/… for an explanation. And when you've sorted that out single is likely to be more performant than section for the file writing. Commented Oct 24, 2016 at 16:21
  • You are missing the OMP PARALLEL directive. Is it hidden somewhere? Perhaps you wanted OMP PARALLEL SECTIONS instead? But Mark is right, it will not make it faster anyway, because disk operations are hard to parallelize. Commented Oct 24, 2016 at 17:40
  • And please, 1. use tag fortran and 2. use titles which describe your problem, your original title just repeated your 2 tags. Commented Oct 24, 2016 at 17:41
  • Thank you for your replies. Since I'm writing to 2 different output files, my purpose here was to split the work and do it in parallel rather than writing the data sequentially to the first file and then to the second file. Commented Oct 24, 2016 at 21:31
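
The missing directive pointed out in the comments can be sketched as follows. This is a minimal, self-contained illustration and not the asker's actual code: the array `m`, the unit numbers 11 and 12, and the file names `out1.bin` and `out2.bin` are all assumptions made up for the example. A bare SECTIONS construct outside a PARALLEL region runs on a single thread, whereas the combined PARALLEL SECTIONS directive creates the thread team first.

```fortran
program sections_demo
    implicit none
    integer, parameter :: n = 100000
    real(kind=8), allocatable :: m(:)

    allocate(m(n))
    call random_number(m)

    ! PARALLEL SECTIONS both starts a team of threads and hands each
    ! section to one of them. Each section uses its own unit number
    ! (11 and 12 are arbitrary, hypothetical choices).
    !$omp parallel sections
    !$omp section
    open(unit=11, file='out1.bin', form='unformatted', access='stream', status='replace')
    write(11) m
    close(11)
    !$omp section
    open(unit=12, file='out2.bin', form='unformatted', access='stream', status='replace')
    write(12) m
    close(12)
    !$omp end parallel sections

    print *, 'wrote', n, 'values to each file'
end program sections_demo
```

The directives only take effect when OpenMP is enabled at compile time (for example `-fopenmp` with gfortran); otherwise they are treated as comments and the two writes run one after the other. As the comments note, even with the directive in place the writes may not get faster if both files live on the same disk.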

2 Answers

0

In general one can divide scientific computing codes into bandwidth-bound and compute-bound algorithms. Bandwidth-bound algorithms are those that perform only a few operations on the data they need, for example O(n) flops on O(n) data. Given hard disk or network connection speeds, I/O is a bandwidth-bound operation as well, and therefore parallelizes poorly or not at all.
If you really want to gain performance from parallelization, split the code into bandwidth-bound and compute-bound parts and spend your time parallelizing the latter.
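
As a rough sketch of what parallelizing the compute-bound part could look like, the loop below fills a matrix with independent entries and shares the outer loop among threads. The loop body is an invented stand-in for the question's `compute_data`, whose contents are not shown, so treat the arithmetic as a placeholder.

```fortran
program compute_demo
    implicit none
    integer, parameter :: n = 2000
    real(kind=8), allocatable :: m(:,:)
    integer :: i, j

    allocate(m(n, n))

    ! Every column is computed independently of the others, so the
    ! iterations of the outer loop can be distributed among threads.
    !$omp parallel do private(i)
    do j = 1, n
        do i = 1, n
            ! Placeholder arithmetic standing in for the real computation.
            m(i, j) = sqrt(real(i, kind=8)) * sin(real(j, kind=8))
        end do
    end do
    !$omp end parallel do

    print *, 'checksum:', sum(m)
end program compute_demo
```

The outer loop runs over columns because Fortran stores arrays column by column, so each thread works on contiguous memory.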


2 Comments

In my case, writing the data to the binary files takes 80% of the bandwidth; that's why I'm trying to write to the different files at the same time (in parallel).
Assuming that writing to files is limited by the bandwidth of the hard disk interface, parallelizing the writes only splits that bandwidth between them. The filesystem's task management and the hardware-related slowdown of classical hard disks under concurrent access can even make the program slower. One way to accelerate the write process is memory-mapped I/O, which is normally much faster. In that case, however, you have to write your I/O in C and then interface it from Fortran, if you really need it there.
0

If you specify your problem more precisely, there are hundreds of experts eager to solve it. From the comment on the answer above I see that you are using binary output but still have bandwidth left to write faster. That means your disk speed is fine and you are not limited by parsing; rather, your actual program is not producing data any faster than this.

So optimize your code so that it catches up with your write speed, instead of increasing the write speed while the code stays equally slow.

Writing the two files sequentially at the maximum of your bandwidth is just as fast as, and much easier than, writing them in parallel (at the same maximum speed).
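
One way to check which part is actually the bottleneck is to time the computation and the sequential writes separately. The sketch below does this with the standard `system_clock` intrinsic; the matrix size, the placeholder computation, the unit numbers and the file names `out1.bin` and `out2.bin` are all assumptions for illustration.

```fortran
program timing_demo
    implicit none
    integer, parameter :: n = 1000
    real(kind=8), allocatable :: m(:,:)
    integer(kind=8) :: t0, t1, t2, rate
    integer :: i, j

    allocate(m(n, n))
    call system_clock(count_rate=rate)

    ! Time the (placeholder) computation.
    call system_clock(t0)
    do j = 1, n
        do i = 1, n
            m(i, j) = sqrt(real(i, kind=8)) + real(j, kind=8)
        end do
    end do
    call system_clock(t1)

    ! Time writing the two files one after the other.
    open(unit=11, file='out1.bin', form='unformatted', access='stream', status='replace')
    write(11) m
    close(11)
    open(unit=12, file='out2.bin', form='unformatted', access='stream', status='replace')
    write(12) m
    close(12)
    call system_clock(t2)

    print '(a, f10.4, a)', 'compute: ', real(t1 - t0, kind=8) / real(rate, kind=8), ' s'
    print '(a, f10.4, a)', 'write:   ', real(t2 - t1, kind=8) / real(rate, kind=8), ' s'
end program timing_demo
```

Keep in mind that the operating system may buffer the writes, so the measured write time can be lower than the time the disk really needs.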

If I am mistaken, and you are indeed limited by IO, maybe this other question/answer can help you: How to avoid programs in status D.
