I am trying to read a 500 MB file in C++ using OpenMP. I split the file into blocks, and each loop iteration loads one block of "killed" records; the iterations are meant to run in parallel.

I'm working on Ubuntu and compiling with g++ (g++ mytest.cpp -o mytest -fopenmp).

Here is the code:

(I've deleted some of my code so that the ReadFile function stands out.)

map<unsigned int, int> Node2Num;
typedef struct {
    unsigned int num;
    set<unsigned int>adj;
}node; 
node Gnode[4800000];

void ReadFile()
{
    ifstream input("./soc-LiveJournal1.bin", ios::in | ios::binary);
    cout << "start to read the file..." << endl;

    //to get the size of the file
    input.seekg(0, ios_base::end);
    size_t fileSize = input.tellg();
    input.seekg(0, ios_base::beg);

    //to split the file into ReadTime blocks
    int EdgesInFile = fileSize / 8; 
    int ReadTime = EdgesInFile / PerEdges;
    unsigned int buffer[2*PerEdges];

    int tid, i, j, killed;
    unsigned int src, des;
    volatile int cnt = 0; //all the nodes stored in the file

    #pragma omp parallel for num_threads(16) private(i,buffer,killed,j,src,des,tid) firstprivate(ReadTime,EdgesInFile)
    for(i = 0;i < ReadTime+1;i++){
        #pragma omp critical
        {
            input.read((char*)buffer, sizeof(buffer));
            cout<<"Thread Num:"<<omp_get_thread_num()<<" Read Time:"<<i<<endl;
        }
        killed = PerEdges;
        if(i == ReadTime) 
            killed = EdgesInFile - ReadTime*PerEdges; 
        for(j = 0;j < killed;j++) {
            // record j occupies two consecutive elements: 2*j (source) and 2*j+1 (destination)
            src = buffer[2*j];
            des = buffer[2*j+1];
            #pragma omp critical
            {
            //to store the src and des... 
            }
        }
    }
    cout << "finish the reading!" << endl;
    input.close();
}

int main()
{
    clock_t T1 = clock();
    ReadFile(); 
    clock_t T2 = clock();
    cout<< "Reading Time:" << (double)(T2 - T1) / CLOCKS_PER_SEC << "seconds" << endl;
    return 0;
}

The file I read in my code stores a graph as a sequence of consecutive 8-byte records. Each record holds two node numbers: a source node (4 bytes) followed by a destination node (4 bytes). Node numbers are stored as unsigned int.

But I couldn't get any speedup by adding the #pragma directive. The reading time does not change with OpenMP, nor with the num_threads value I set in the directive. It is always the same, about 200 seconds.

Could anyone tell me where the problem is?

  • If this is disk reading, your bottleneck will be the disk read speed, and not the processing speed, since it looks like your algorithm isn't terrible. Can you get the timing of the actual load into memory, instead of the total algorithm? Commented Oct 14, 2018 at 4:49
  • I think the statement "ifstream input("./soc-LiveJournal1.bin", ios::in | ios::binary);" loads the file from disk into main memory all at once, and what I'm trying to do is read the file from different memory positions in parallel. Commented Oct 14, 2018 at 4:59
  • No, ifstream doesn't read the whole file into memory in advance. Normally, the actual reading is performed only when you call read. Anyway, you are performing the reading inside a critical section, and your parallel code consists primarily of such sections. Note that only non-critical sections can be sped up using parallel threads. Commented Oct 14, 2018 at 7:25
  • OK, I see. Thanks a lot! Commented Oct 17, 2018 at 4:21

1 Answer

You are right to put #pragma omp critical around the I/O operations!

However, the omp critical directive identifies a section of code that must be executed by a single thread at a time.

Most of the code inside your loop is in an omp critical section, and the rest of it is not CPU-intensive.

Therefore what you observe is normal.

You will hardly get better performance on I/O operations by using OpenMP unless you are using a parallel filesystem.
OpenMP is really meant for speeding up CPU-intensive sections.


1 Comment

Thank you. But I think the #pragma directive before the I/O operations is useless, because input.read(...) has a thread-protection mechanism, which makes the read operation critical by nature.
