Extracting Text from PowerPoint Using OfficeJS

Question

I'm trying to create an Office Add-in with a function that, on a button click, automatically extracts all text in a PowerPoint presentation and/or all text on the current slide. I'm new to OfficeJS and haven't been able to find examples on this topic.

I think the easiest way to do this would be to iterate through each slide, iterate through all TextFrames on each slide, and read the text value of each TextFrame. However, I can't find a "read" method for TextFrame in the Office documentation.

The only thing I have been able to do so far is extract text I have highlighted with my cursor.

async function getSlideText() {
    Office.context.document.getSelectedDataAsync(Office.CoercionType.Text, (asyncResult) => {
        if (asyncResult.status === Office.AsyncResultStatus.Failed) {
            setMessage("Error");
        } else {
            setMessage("Selected the following Text: " + asyncResult.value);
        }
    });
}

Does anyone have any ideas on how to approach this? Thanks in advance!

Edit

I have come up with a solution that, so far, can iterate through all shapes on a slide, each shape's TextFrame, and each TextFrame's TextRange. However, I get an error whenever I try to access the TextRange's text property.

My method:

async function getParsedText() {
    await PowerPoint.run(async (context) => {
        // Get the shapes from first slide of ppt
        const sheet = context.presentation.slides.getItemAt(0);
        const shapes = sheet.shapes;

        // Load all the shapes in the collection
        shapes.load();
        await context.sync();

        shapes.items.forEach(function (shape) { // for each shape get tf, tr, and txt
            if (shape.textFrame != null) {
                const tf = shape.textFrame;
                tf.load();

                if (tf != null) {
                    console.log("tf is found");
                }
                else {
                    console.log("tf is not found");
                }

                const tr = tf.textRange;
                tr.load();

                if (tr != null) {
                    console.log("tr is found");
                }
                else {
                    console.log("tr is not found");
                }

                const txt = tr.text;
                txt.load();
                if (txt != null) {
                    console.log("txt is found");
                }
                else {
                    console.log("txt is not found");
                }

                console.log("Text in shape: ", txt);

                context.sync();
            }
        });
        await context.sync();
    });
}

My error:

Uncaught (in promise) RichApi.Error: The property 'text' is not available. Before reading the property's value, call the load method on the containing object and call "context.sync()" on the associated request context.

Are there any suggestions on how to resolve this?

Rick Kirkham · Accepted Answer · 2023-12-07 23:54:20Z

Each TextFrame has a textRange property which has a text property. Just read that property.
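
As a minimal sketch of what reading that property involves: the value is only available after it has been loaded and synced. The function name readShapeText and its index parameters are made up for illustration; the context argument is assumed to be the request context that PowerPoint.run passes to its callback, and textFrame is assumed to be available in the host (I believe it needs PowerPointApi 1.4 or later).

```javascript
// Hypothetical helper: returns the text of one shape on one slide.
// `context` is the request context supplied by PowerPoint.run.
async function readShapeText(context, slideIndex, shapeIndex) {
    const shape = context.presentation.slides
        .getItemAt(slideIndex)
        .shapes.getItemAt(shapeIndex);

    // Queue a load of the `text` property on the shape's text range.
    const range = shape.textFrame.textRange;
    range.load("text");

    // Nothing is readable until the queued load has been sent to PowerPoint.
    await context.sync();

    return range.text;
}

// Possible button handler (not called here): logs the first shape's text.
async function onReadFirstShapeClick() {
    await PowerPoint.run(async (context) => {
        console.log(await readShapeText(context, 0, 0));
    });
}
```

Keeping a reference to the range before the sync and reading range.text afterwards is the part the error message in the question is asking for.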

5 Comments

Goody3333 Over a year ago

Is there a way to iterate through all textFrames on each slide?

Rick Kirkham Over a year ago

Yes. Iterate through the slide's ShapeCollection (slide.shapes). There's a textFrame on each shape.
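
A rough sketch of that iteration for a whole presentation, under a few assumptions: readAllSlideText is a hypothetical name, context is the request context from PowerPoint.run, and every shape on every slide is assumed to support a text frame. Shapes that cannot hold text (pictures, for example) may cause the final sync to fail; that case is not handled here.

```javascript
// Hypothetical helper: returns an array per slide, each holding the text of
// that slide's shapes. `context` is the request context from PowerPoint.run.
async function readAllSlideText(context) {
    // 1. Load the slide collection.
    const slides = context.presentation.slides;
    slides.load("items/id");
    await context.sync();

    // 2. Queue a load of every slide's shape collection, then sync once.
    const shapeCollections = slides.items.map((slide) => {
        const shapes = slide.shapes;
        shapes.load("items/id");
        return shapes;
    });
    await context.sync();

    // 3. Queue a load of every shape's text, then sync once more.
    const rangesBySlide = shapeCollections.map((shapes) =>
        shapes.items.map((shape) => {
            const range = shape.textFrame.textRange;
            range.load("text");
            return range;
        })
    );
    await context.sync();

    // 4. The loaded values can now be read.
    return rangesBySlide.map((ranges) => ranges.map((range) => range.text));
}

// Possible button handler (not called here).
async function onExtractAllClick() {
    await PowerPoint.run(async (context) => {
        console.log(await readAllSlideText(context));
    });
}
```

The number of context.sync calls stays at three no matter how many slides or shapes there are, because the loops only queue loads.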

Goody3333 Over a year ago

For some reason, I am having issues accessing the text property of a textRange. I have added an edit that includes my current code (credits to your suggestions!)

Rick Kirkham Over a year ago

Your code is reading text before it calls context.sync. You have to call context.sync before you try to read the properties that you have loaded. Also, don't call context.sync in a loop. See "Avoid using the context.sync method in loops" in the Office Add-ins documentation.
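
For reference, here is one way the method from the question could be reworked along those lines. This is a sketch, not a tested add-in: collectFirstSlideText is a hypothetical helper split out of getParsedText so the request context can be passed in, and it assumes every shape on the first slide supports a text frame.

```javascript
// Hypothetical helper: returns the text of every shape on the first slide.
// `context` is the request context supplied by PowerPoint.run.
async function collectFirstSlideText(context) {
    const shapes = context.presentation.slides.getItemAt(0).shapes;

    // First round trip: find out which shapes exist.
    shapes.load("items/id");
    await context.sync();

    // Inside the loop, only queue the loads; do not sync and do not read yet.
    const ranges = shapes.items.map((shape) => {
        const range = shape.textFrame.textRange;
        range.load("text");
        return range;
    });

    // Second round trip, outside the loop: now `text` is available.
    await context.sync();
    return ranges.map((range) => range.text);
}

// Reworked version of the question's function (not called here).
async function getParsedText() {
    await PowerPoint.run(async (context) => {
        const texts = await collectFirstSlideText(context);
        texts.forEach((txt) => console.log("Text in shape: ", txt));
    });
}
```

The null checks from the original are dropped because textFrame and textRange are proxy objects and are never null on the client; only the loaded text value carries information.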

Rick Kirkham Over a year ago

I recommend this book: leanpub.com/buildingofficeaddins
