I am using ObjectOutputStream to save an object, but when I call .writeObject(this) to write it to a file, the data is not saved. The class I defined already implements Serializable.

public class LanguageModel implements Serializable {


private static LanguageModel lm_;

/* ******************************* */
//word -> count(w)
public static Dictionary unigramDict = new Dictionary();
//word_pair -> count(wi,wi+1)
public static Dictionary bigramDict = new Dictionary();

private static int wordIdCounter = 0;
/* ***************************** */


// Do not call constructor directly since this is a Singleton
private LanguageModel(String corpusFilePath) throws Exception {
    constructDictionaries(corpusFilePath);
}


public void constructDictionaries(String corpusFilePath)
        throws Exception {

    ...
    }

// Saves the object (and all associated data) to disk
public void save() throws Exception{
    FileOutputStream saveFile = new FileOutputStream(Config.languageModelFile);
    ObjectOutputStream save = new ObjectOutputStream(saveFile);
    save.writeObject(this);
    save.close();
}

// Creates a new lm object from a corpus
public static LanguageModel create(String corpusFilePath) throws Exception {
    if(lm_ == null ){
        lm_ = new LanguageModel(corpusFilePath);
    }
    return lm_;
}

}

The Dictionary class I defined is as follows:

import java.io.Serializable;

import java.util.HashMap;

public class Dictionary implements Serializable {

private int termCount;
private HashMap<String, Integer> map;

public int termCount() {
    return termCount;
}

public Dictionary() {
    termCount = 0;
    map = new HashMap<String, Integer>();
}

...
}

When I call save.writeObject(unigramDict), the variable is saved properly. Because it is large, I can verify this from the size of the file, which is 5 MB. When I switch to save.writeObject(this), the file is only 53 bytes.

  • I don't understand the last paragraph. You said that save.writeObject(unigramDict) gives you 5MB, then you said that save.writeObject(unigramDict) gives you 53B. Which is it? By the way, you could use a debugger to check that the Dictionary objects are actually being populated correctly before you save. Commented Apr 24, 2014 at 14:44
  • @DavidWallace I think the second unigramDict was meant to be this: in the first case he saves the Map, and in the second case he saves the LanguageModel object without its static fields :) Commented Apr 24, 2014 at 15:02
  • Java serialization only saves non-transient, non-static fields, because it saves the state of one particular instance. Make any fields you want to save non-static. Commented Apr 24, 2014 at 16:02
  • Oh, I didn't see the static modifiers. Whoops. Commented Apr 24, 2014 at 19:03

1 Answer

I think the problem is the static fields, which are not saved by save.writeObject(this).

From the ObjectOutputStream javadoc:

The default serialization mechanism for an object writes the class of the object, the class signature, and the values of all non-transient and non-static fields.
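
To see this rule in action, here is a minimal sketch that serializes an object in memory and compares sizes. StaticFieldDemo, Holder, and serializedSize are hypothetical names made up for this illustration and are not part of the question's code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

public class StaticFieldDemo {

    // Hypothetical class for illustration only.
    public static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;

        // Static: belongs to the class, so default serialization skips it.
        public static HashMap<String, Integer> staticCounts = new HashMap<>();

        // Instance field: written as part of the object's state.
        public HashMap<String, Integer> instanceCounts = new HashMap<>();
    }

    // Serializes an object into memory and returns the number of bytes written.
    public static int serializedSize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        Holder holder = new Holder();
        int emptySize = serializedSize(holder);

        // Filling the static map does not change the serialized size.
        for (int i = 0; i < 1000; i++) {
            Holder.staticCounts.put("word" + i, i);
        }
        int afterStatic = serializedSize(holder);

        // Filling the instance map makes the serialized form grow.
        for (int i = 0; i < 1000; i++) {
            holder.instanceCounts.put("word" + i, i);
        }
        int afterInstance = serializedSize(holder);

        System.out.println("empty:              " + emptySize + " bytes");
        System.out.println("static map filled:   " + afterStatic + " bytes");
        System.out.println("instance map filled: " + afterInstance + " bytes");
    }
}
```

The first two sizes should be identical, which matches the tiny file the question reports: only the class description is written when every interesting field is static.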

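You should simply make unigramDict and bigramDict non-static fields and access them through the singleton instance, for example LanguageModel.lm_.unigramDict inside the class, or via the object returned by create() elsewhere, since lm_ is private.
The class is already set up as a singleton, so you can rely on that pattern instead of making all the fields static.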
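
A rough sketch of that change, under some assumptions: LanguageModelSketch is a hypothetical stand-in for LanguageModel, a plain HashMap replaces the question's Dictionary class to keep the example self-contained, and the getInstance, addUnigram, unigramCount, and load methods are invented for illustration. Only the singleton reference stays static; the data lives in instance fields, so writeObject(this) includes it.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

public class LanguageModelSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // The singleton reference itself stays static (and is not serialized).
    private static LanguageModelSketch instance;

    // Instance fields: these are written by writeObject(this).
    // HashMap stands in for the question's Dictionary class.
    private final HashMap<String, Integer> unigramCounts = new HashMap<>();
    private final HashMap<String, Integer> bigramCounts = new HashMap<>();

    private LanguageModelSketch() { }

    public static synchronized LanguageModelSketch getInstance() {
        if (instance == null) {
            instance = new LanguageModelSketch();
        }
        return instance;
    }

    public void addUnigram(String word) {
        unigramCounts.merge(word, 1, Integer::sum);
    }

    public int unigramCount(String word) {
        return unigramCounts.getOrDefault(word, 0);
    }

    // Saves this object, including both maps, to the given file.
    public void save(File file) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(this);
        }
    }

    // Reads a saved model back and installs it as the singleton instance.
    public static synchronized LanguageModelSketch load(File file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(file))) {
            instance = (LanguageModelSketch) in.readObject();
        }
        return instance;
    }
}
```

With this layout, calling save() after populating the maps should produce a file whose size grows with the data, and load() restores the counts in a fresh JVM.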
