
I have a large number of records in a binary file, and I want to search for something in it. Is there any way I could use a LINQ statement on the file data without putting all the data in memory (like a List<T>)?

I have these methods that use List<Book>:

private Book Read(long position)
{
    Book book;
    using (Stream st = File.Open(HttpContext.Current.Server.MapPath("/") + "library.majid", FileMode.OpenOrCreate, FileAccess.Read))
    {
        st.Position = position;
        using (BinaryReader reader = new BinaryReader(st))
        {
            // The first byte of each record flags whether the slot holds a book.
            if (!reader.ReadBoolean())
                return null;
            book = new Book()
            {
                Id = reader.ReadInt32(),
                Name = reader.ReadString(),
                Dewey = reader.ReadString()
            };
            try
            {
                book.Subject = reader.ReadString();
                book.RegDate = reader.ReadInt32();
                book.PubDate = reader.ReadInt32();
            }
            catch (EndOfStreamException) { }
        }
    }
    return book;
}

private List<Book> getAll(int recordLength = 100) // results are sorted by Id
{
    long Len;
    using (Stream st = File.Open(HttpContext.Current.Server.MapPath("/") + "library.majid", FileMode.OpenOrCreate, FileAccess.Read))
    {
        Len = st.Length;
    }
    List<Book> res = new List<Book>();
    Book ReadedBook = null;
    // Use recordLength (not a hard-coded 100) and a long counter so large files do not overflow.
    for (long i = 0; i < Len / recordLength; i++)
    {
        ReadedBook = Read(i * recordLength);
        if (ReadedBook != null)
            res.Add(ReadedBook);
    }
    res.Sort((x, y) => x.Id.CompareTo(y.Id));
    return res;
}
  • Consider showing your file structure and some code that shows your current effort. Commented Apr 29, 2013 at 13:30
  • What advantage does not putting data into memory give you? Commented Apr 29, 2013 at 13:31
  • You can't use LINQ to Objects without the objects being in memory. You could write a custom query provider. Commented Apr 29, 2013 at 13:31
  • @Brad If the system has little memory, it will become slow. Commented Apr 29, 2013 at 13:32
  • @majidgeek I suggest you search for BinarySerialization. Commented Apr 29, 2013 at 13:46

2 Answers


If it is a text file, you can use File.ReadLines(filename), which returns an IEnumerable<string> without loading the whole file into memory.

See http://msdn.microsoft.com/en-us/library/dd383503.aspx

The ReadLines and ReadAllLines methods differ as follows: When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.

For example:

var count = File.ReadLines(somefile)
                .Where(line => line.StartsWith("something"))
                .Count();

EDIT

What if it is a binary file?

Then you can write a method similar to this:

public static IEnumerable<Book> ReadBooks(string filename)
{
    using (var f = File.Open(filename, FileMode.Open))
    {
        using (BinaryReader rdr = new BinaryReader(f))
        {
            Book b = new Book();
            //.....
            yield return b;
        }
    }
}
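
To make the placeholder above concrete, here is a minimal sketch of such an iterator. It assumes the layout from the question (fixed-length 100-byte slots, a leading boolean flag, then Id, Name and Dewey); the class name BookFile, the recordLength parameter and the trimmed-down Book class are hypothetical and only there so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Trimmed-down stand-in for the question's Book class (assumption).
public class Book
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Dewey { get; set; }
}

public static class BookFile
{
    // Lazily yields one Book per occupied slot; only the current record is held in memory.
    public static IEnumerable<Book> ReadBooks(string filename, int recordLength = 100)
    {
        using (var stream = File.OpenRead(filename))
        using (var reader = new BinaryReader(stream))
        {
            long count = stream.Length / recordLength;
            for (long i = 0; i < count; i++)
            {
                // Jump to the start of slot i.
                stream.Position = i * recordLength;

                // The leading flag says whether the slot holds a book.
                if (!reader.ReadBoolean())
                    continue;

                // Initializers run top to bottom, matching the order on disk.
                yield return new Book
                {
                    Id = reader.ReadInt32(),
                    Name = reader.ReadString(),
                    Dewey = reader.ReadString()
                };
            }
        }
    }
}
```

With that in place, LINQ operators work directly on the sequence, for example BookFile.ReadBooks(path).Where(b => b.Dewey.StartsWith("005")).Take(10). Streaming operators such as Where, Take and FirstOrDefault read only as far as they need to, whereas OrderBy or ToList will still buffer every record they receive.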

3 Comments

What if it is a binary file?
@majidgeek Like in your code. Read from the binary file (ReadInt32, ReadString, etc.) and assign the properties of Book.
@majidgeek Nowhere. Just assign the properties at //....., that's all. msdn.microsoft.com/en-us/library/vstudio/9k7k7cf0.aspx

If you only want to search for some data, you can keep an implementation similar to your getAll method, pass it some parameters that describe the search, and return a List (or IEnumerable<T>). This way you keep only the matching items in memory.

Your Read method does not keep elements in memory; each Book lives only within the method's scope unless the caller stores it.

By the way, you could pass the reader to your Read method so that you do not create a new stream and reader on each iteration. The stream "cursor" keeps the position of the last chunk of data read.
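
As a rough sketch of both suggestions, assuming the record layout from the question: the file is opened once, the same BinaryReader is handed to Read for every slot, and only books that satisfy the search condition are kept. The names BookSearch and Find, the predicate parameter and the trimmed-down Book class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Trimmed-down stand-in for the question's Book class (assumption).
public class Book
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Dewey { get; set; }
}

public static class BookSearch
{
    // Reads the record at the given offset through an already-open reader.
    private static Book Read(BinaryReader reader, long position)
    {
        reader.BaseStream.Position = position;
        if (!reader.ReadBoolean())
            return null; // slot is flagged as empty
        return new Book
        {
            Id = reader.ReadInt32(),
            Name = reader.ReadString(),
            Dewey = reader.ReadString()
        };
    }

    // Scans the file once and keeps only the books that match the predicate.
    public static List<Book> Find(string filename, Func<Book, bool> predicate, int recordLength = 100)
    {
        var matches = new List<Book>();
        using (var stream = File.OpenRead(filename))
        using (var reader = new BinaryReader(stream))
        {
            long count = stream.Length / recordLength;
            for (long i = 0; i < count; i++)
            {
                Book book = Read(reader, i * recordLength);
                if (book != null && predicate(book))
                    matches.Add(book);
            }
        }
        return matches;
    }
}
```

A call such as BookSearch.Find(path, b => b.Name.Contains("LINQ")) then holds only the matches in the returned list. The position is still set explicitly for each slot because the strings inside a record vary in length, so the cursor does not necessarily sit at the start of the next slot after a read.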
