8

Is there any way to get metadata, such as the author or title, from a PDF document using pdf.js?

In this example: http://mozilla.github.io/pdf.js/web/viewer.html?file=compressed.tracemonkey-pldi-09.pdf

<div class="row">
<span data-l10n-id="document_properties_author">
    Autor:
</span>
<p id="authorField">
    -
</p>

The authorField is empty. Is there any way to get this information?

2
  • Can you include a snippet of the code you use or something? Commented Mar 30, 2014 at 12:20
  • The PDF does not have the author field populated. Display a different document, e.g. mozilla.github.io/pdf.js/web/viewer.html?file=/deuxdrop/… Commented Mar 31, 2014 at 13:23

4 Answers

13

Using just the PDF.js library, without a third-party viewer, you can get the metadata through the promise returned by getMetadata():

// Assumes the legacy global `PDFJS` (older pdf.js builds) and a `url` string are in scope.
var pdfDoc = null;

PDFJS.getDocument(url).then(function (pdfDoc_) {
    pdfDoc = pdfDoc_;
    pdfDoc.getMetadata().then(function (stuff) {
        console.log(stuff); // Metadata object here
    }).catch(function (err) {
        console.log('Error getting metadata');
        console.log(err);
    });

    // Render the first page or whatever here
    // More code . . .
}).catch(function (err) {
    console.log('Error getting PDF from ' + url);
    console.log(err);
});

I found this out after dumping the pdfDoc object to the console and looking through its functions and properties. I found the method in its prototype and decided to just give it a shot. Lo and behold it worked!
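Once the metadata object is available, picking out individual fields is straightforward. The following is a minimal sketch, assuming the resolved object carries an `info` dictionary with keys such as `Title` and `Author`. `summarizeMetadata` is a hypothetical helper name, not part of pdf.js, and the `'-'` fallback simply mirrors the placeholder shown in the viewer.

```javascript
// Hypothetical helper (not part of pdf.js): pulls a few common fields out of
// the object that getMetadata() resolves with.
// Assumes the shape { info: { Title, Author, ... } }.
function summarizeMetadata(result) {
    var info = (result && result.info) || {};
    return {
        // Fall back to '-' when a field is missing or empty.
        title: info.Title || '-',
        author: info.Author || '-'
    };
}

// Example with a hand-written object standing in for a real getMetadata() result.
var sample = { info: { Title: 'Example Title' } };
console.log(summarizeMetadata(sample));
```

Inside the promise callback above, this could be called as summarizeMetadata(stuff).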


5 Comments

I presume that your phrase "utilizing promises" is a mistake which was introduced during spellchecking? :)
To view the object's contents, you can use: console.log(JSON.stringify(stuff, null, 2))
Where does "PDFJS" come from? It's all undefined when I try to use it.
@mondjunge I wrote this answer over 5 years ago, unfortunately, and don't do JavaScript work anymore. It's possible the library has since been updated. Maybe check out some of the newer example code here? mozilla.github.io/pdf.js/examples
I included pdf.js from its GitHub page and now it works. I guess the pdf.js bundled with Firefox has some kind of protective layer?!

2

You can get the document's basic metadata from the PDFViewerApplication.documentInfo object. For example, to get the author, use PDFViewerApplication.documentInfo.Author.


0
// Assumes `pdfDoc` is an already-loaded PDF document and jQuery is available.
// getMetadata() takes no arguments.
pdfDoc.getMetadata().then(function (stuff) {
    var title = stuff.info.Title;
    if (title) {
        $('#element-html').text(title); // Print metadata to HTML
    }
    console.log(stuff); // Print metadata to console
}).catch(function (err) {
    console.log('Error getting metadata');
    console.log(err);
});


0

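Try the following (this assumes getDocument has been imported, for example from the pdfjs-dist package, and that the call runs where await is allowed):

const metadata = await getDocument(url).promise.then(doc => doc.getMetadata());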
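The one-liner can be wrapped in a small function with error handling. This is only a sketch: `readPdfInfo` is a hypothetical name, `getDocument` is passed in as a parameter so the code makes no assumption about how pdf.js was loaded, and returning `null` on failure is an arbitrary convention chosen for illustration.

```javascript
// Sketch only: `getDocument` is injected so the function does not assume
// how pdf.js was loaded (global script, bundler import, ...).
async function readPdfInfo(getDocument, url) {
    try {
        // The loading task returned by getDocument exposes a `.promise`.
        const doc = await getDocument(url).promise;
        const result = await doc.getMetadata();
        // `info` holds entries such as Title and Author when the PDF sets them.
        return result.info || {};
    } catch (err) {
        console.log('Error getting metadata from ' + url);
        console.log(err);
        return null; // hypothetical convention: null signals failure
    }
}
```

Assuming pdf.js is available as pdfjsLib, a call might look like readPdfInfo(function (src) { return pdfjsLib.getDocument(src); }, url), after which the Author and Title entries can be read from the returned object.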
