6

I have a desire to add a property with a default value to a set of documents that I retrieve via a SELECT query if they contain no value.

I was thinking of this in two parts:

  1. SELECT * FROM c article WHERE article.details.locale = 'en-us'

I'd like to find all articles where article.details.x does not exist.

  1. Add the property, article.details.x = true

I was hoping this EXEC command could be supported via the Azure Portal so I don't have to create a migration tool to run this command once but I couldn't find this option in the portal. Is this possible?

0

2 Answers 2

8

You can use Azure Document DB Studio as a front end to creating and executing a stored procedure. It can be found here. It's pretty easy to setup and use.

I've mocked up a stored procedure based on your example:

function updateArticlesDetailsX() {

   var collection = getContext().getCollection();
   var collectionLink = collection.getSelfLink();
   var response = getContext().getResponse();
   var docCount = 0;
   var counter = 0;

   tryQueryAndUpdate();

   function tryQueryAndUpdate(continuation) {
        var query = {
            query: "select * from root r where IS_DEFINED(r.details.x) != true"
        };

        var requestOptions = {
            continuation: continuation
        };

        var isAccepted =
            collection
            .queryDocuments(collectionLink,
                            query,
                            requestOptions,
                            function queryCallback(err, documents, responseOptions) {
                                     if (err) throw err;
                                     if (documents.length > 0) {
                                        // If at least one document is found, update it.
                                        docCount = documents.length;
                                        for (var i=0; i<docCount; i++){
                                            tryUpdate(documents[i]);
                                        }
                                        response.setBody("Updated " + docCount + " documents");
                                      }
                                      else if (responseOptions.continuation) {
                                          // Else if the query came back empty, but with a continuation token; 
                                          // repeat the query w/ the token.
                                        tryQueryAndUpdate(responseOptions.continuation);
                                      } else {
                                             throw new Error("Document not found.");
                                             }
                            });

        if (!isAccepted) {
            throw new Error("The stored procedure timed out");
        }
    }

    function tryUpdate(document) {
        //Optimistic concurrency control via HTTP ETag.
        var requestOptions = { etag: document._etag };

        //Update statement goes here:
        document.details.x = "some new value";

        var isAccepted = collection
                         .replaceDocument(document._self,
                                          document,
                                          requestOptions,
                                          function replaceCallback(err, updatedDocument, responseOptions) {
                                                   if (err) throw err;
                                                   counter++;
                                           });

        // If we hit execution bounds - throw an exception.
        if (!isAccepted) {
            throw new Error("The stored procedure timed out");
        }
    }
}

I got the rough outline for this code from Andrew Liu on GitHub.

This outline should be close to what you need to do.

Sign up to request clarification or add additional context in comments.

1 Comment

does this have an issue? will it just update the first set of documents found in the first page? if (documents.length > 0)
6

DocumentDB has no way in a single query to update a bunch of documents. However, the portal does have a Script Explorer that allows you to write and execute a stored procedure against a single collection. Here is an example sproc that combines a query with a replaceDocument command to update some documents that you could use as a starting point for writing your own. The one gotcha to keep in mind is that DocumentDB will not allow sprocs to run longer than 5 seconds (with some buffer). So you may have to run your sproc multiple times and keep track of what you've already done if it can't complete in one 5 second run. The use of IS_DEFINED(collection.field.subfield) != true (thanks @cnaegle) in your query followed up by a document replacement that defines that field (or removes that document) should allow you to run the sproc as many times as necessary.

If you didn't want to write a sproc, the easiest thing to do would be to export the database using the DocumentDB Data Migration tool. Import that into Excel to manipulate or write a script to do the manipulation. Then upload it again using the Data Migration tool.

2 Comments

Thanks for the heads up Larry, about question 1. Is there any ability for documentDB to query based on columns that do not exist within the JSON structure?
Yes, you can use: IS_DEFINED(articles.details.x) != true

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.