1

I have a Google spreadsheet composed of four worksheets as follows:

group_memberships: key | user_id | group_id
app_user_memberships: key | user_id | app_id
app_group_memberships: key | group_id | app_id
apps: app_id

The logic of this worksheets are as follows:

group_memberships: user_id is assigned to group_id
app_user_memberships: user_id is assigned to app_id
app_group_memberships: group_id is assigned to app_id
apps: contains a list of app_ids

A user can be part of a group and that group is assigned to the application. Once the group is assigned to the application, the user will be assigned to the application.

The query that I am trying to make is in a format like the following:

for each application
    select user IDs where
        group is assigned to the app
        user is assigned to the group
        user is not assigned to the app

The values app_id, group_id and user_id are the same throughout the spreadsheet.

4
  • 1
    Please provide a link to a view only version of your spreadsheet Commented Apr 28, 2021 at 15:53
  • 1
    @JohnA Please see here docs.google.com/spreadsheets/d/… Commented Apr 28, 2021 at 16:04
  • It seems that the shared sheet was already removed.Is your concern has already been resolved from the answer provided by @JohnA? If not yet, could you please share the sample sheet again together with an expected output. Commented Apr 28, 2021 at 18:41
  • 1
    @Ron M: The issue was resolved by John's reply. Commented Apr 29, 2021 at 19:12

1 Answer 1

1

Thanks for providing the sheet. QUERY() is modeled after SQL and is very powerful but it is not a full SQL implementation and does not provide for JOINs, at least not directly. I have seen many complex solutions with formulas that are impossible to understand and debug.

So, I took it on my own to write a custom function with provides INNER, LEFT, RIGHT and FULL JOINs.

Please see my sample spreadsheet and feel free to make a copy

https://docs.google.com/spreadsheets/d/1vziuF8gQcsOxTLEtlcU2cgTAYL1eIaaMTAoIrAS7mnE/edit#gid=0

This is the script which of course is also in the sample ss above

const ss = SpreadsheetApp.getActiveSpreadsheet();



/**
 * Combines two ranges with a common key and can be used standalone or with the QUERY() function to simulate joins.
 *
 * 
 * @constructor
 * @param {(string|array)} range1 - the main table as a named range, a1Notation or an array
 * @param {(string|array)} range2 - the related table as a named range, a1Notation or an array
 * @param {number} primaryKey - the unique identifier for the main table, columns start with "1"
 * @param {number} foreignKey - the key in the related table to join to the main table, columns start with "1"
 * @param {string} joinType, type of join - "Inner", "Left", "Right", "Full", optional and defaults to "Inner", case insensitive
 * @returns {array} array results as a two dimensional array
 * @customfunction
 *
 * Result Set Example:
 *
 * =QUERY(denormalize("Employees","Orders",1,3), "SELECT * WHERE Col2 = 'Davolio' AND Col8=2", FALSE)
 *
 * |EmpID|LastName|FirstName|OrderID|CustomerID|EmpID|OrderDate|ShipperID|
 * |:----|:-------|:--------|:------|:---------|:----|:--------|:--------|
 * |1      |Davolio |Nancy    |10285  |63        |1    |8/20/1996|2        |
 * |1    |Davolio |Nancy    |10292  |81        |1    |8/28/1996|2        |
 * |1    |Davolio |Nancy    |10304  |80        |1    |9/12/1996|2        | 
 * etc.
 *
 * Other Examples:
 * =denormalize("Employees","Orders",1,3)
 * =denormalize("Employees","Orders",1,3,"full")
 * =QUERY(denormalize("Employees","Orders",1,3,"left"), "SELECT * ", FALSE)
 * =QUERY(denormalize("Employees","Orders",1,3), "SELECT * WHERE Col2 = 'Davolio'", FALSE)
 * =QUERY(denormalize("Employees","Orders",1,3), "SELECT * WHERE Col2 = 'Davolio' AND Col8=2", FALSE)
 * =denormalize("Orders","OrderDetails",1,2)
 * // multiple joins 
 * =denormalize("Employees",denormalize("Orders","OrderDetails",1,2),1,3)
 * =QUERY(denormalize("Employees",denormalize("Orders","OrderDetails",1,2),1,3), "SELECT *", FALSE)
 * =denormalize(denormalize("Employees","Orders",1,3),"OrderDetails",1,2)
 * =QUERY(denormalize("Employees",denormalize("Orders","OrderDetails",1,2),1,3), "SELECT *", FALSE)
 * =QUERY(denormalize(denormalize("Employees","Orders",1,3),"OrderDetails",4,2), "SELECT *", FALSE)
 *
 * Joins Types:
 * (INNER) JOIN: Returns records that have matching values in both tables
 * LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table
 * RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table
 * FULL (OUTER) JOIN: Returns all records when there is a match in either left or right table
 *
 * Note: the most common join is INNER which is why that is the default join type
 * 
 * General:
 * This alogithm is more efficient than using nested loops and uses a form of a hash table instead.
 * A hash table is a structure that can map index keys to values and typically resembles something like this:
 * [index][values]
 *
 * Since javascript provides a native function to return the index, there is no need to store it so,
 * this hash table only stores the values.
 *
 * There is minimal testing in DENORMALIZE() to validate parameters.
 *
 * Author/Coder/Tester: John Agusta, 03/28/2021, Raleigh, NC USA
 *
 * License: Follows the GNU General Public License (GNU GPL or simply GPL), a series of widely-used free 
 * software licenses that guarantee end users the freedom to run, study, share, and modify the software.
 *
 * http://www.gnu.org/licenses/gpl.html
 *
 *
 * Note: DENORMALIZE() can simulate multiple joins by nesting DENORMALIZE() functions as needed.
 *
 * Recursion is theoretically possible to unlimited depth, although only a few levels are normally used in practical programs
 * as performance will degrade accordingly.
 *
 * DENORMALIZE(range1, range2, primaryKey, foreignKey, joinType)
 *
 */
function DENORMALIZE(range1, range2, primaryKey, foreignKey, joinType) {
  var i = 0;
  var j = 0;
  var index = -1;
  var lFound = false;
  var aDenorm = [];
  var hashtable = [];
  var aRange1 = "";
  var aRange2 = "";
  joinType = DefaultTo(joinType, "INNER").toUpperCase();
  // the 6 lines below are used for debugging
  //range1 = "Employees";
  //range1 = "Employees!A2:C12";
  //range2 = "Orders";
  //primaryKey = 1;
  //foreignKey = 3;
  //joinType = "LEFT";
  // Sheets starts numbering columns starting with "1", arrays are zero-based
  primaryKey -= 1;
  foreignKey -= 1;
  // check if range is not an array
  if (typeof range1 !== 'object') {
    // Determine if range is a1Notation and load data into an array
    if (range1.indexOf(":") !== -1) {
      aRange1 = ss.getRange(range1).getValues();
    } else {
      aRange1 = ss.getRangeByName(range1).getValues();
    } 
  } else {
    aRange1 = range1;
  }
  
  if (typeof range2 !== 'object') {
    if (range2.indexOf(":") !== -1) {
      aRange2 = ss.getRange(range2).getValues();
    } else {
      aRange2 = ss.getRangeByName(range2).getValues();
    }
  } else {
    aRange2 = range2;
  }
  
  // make similar structured temp arrays with NULL elements
  var tArray1 = MakeArray(aRange1[0].length);
  var tArray2 = MakeArray(aRange2[0].length);
  var lenRange1 = aRange1.length;
  var lenRange2 = aRange2.length;
  hashtable = getHT(aRange1, lenRange1, primaryKey);
  for(i = 0; i < lenRange2; i++)  {
    index = hashtable.indexOf(aRange2[i][foreignKey]);
    if (index !== -1) {
      aDenorm.push(aRange1[index].concat(aRange2[i]));
    }
  }
  // add left and full no matches
  if (joinType == "LEFT" || joinType == "FULL") {
    for(i = 0; i < lenRange1; i++)  {
      index = aDenorm.indexOf(aRange1[i][primaryKey]);
      //index = aScan(aDenorm, aRange1[i][primaryKey], primaryKey)
      if (index == -1) {
        aDenorm.push(aRange1[i].concat(tArray2));
      }
    }
  }
  // add right and full no matches
  if (joinType == "RIGHT" || joinType == "FULL") {
    for(i = 0; i < lenRange2; i++)  {
      index = ASCAN(aDenorm, aRange2[i][foreignKey], primaryKey)
      if (index == -1) {
        aDenorm.push(tArray1.concat(aRange2[i]));
      }
    }
  }
    return aDenorm;
}

function getHT(aRange, lenRange, key){
var aHashtable = [];
var i = 0;
for (i=0; i < lenRange; i++ ) {
  //aHashtable.push([aRange[i][key], i]);
  aHashtable.push(aRange[i][key]);
  }
return aHashtable;
}

function MakeArray(length) {
  var i = 0;
  var retArray = [];
  for (i=0; i < length; i++) {
    retArray.push("");
  }
  return retArray;
}

function DefaultTo(valueToCheck, valueToDefault) {
return typeof valueToCheck === "undefined" ? valueToDefault : valueToCheck;
}

/**
 *
 * Search a multi-dimensional array for a value and return either the index or value if found, -1 or an empty sting otherwise
 * @constructor
 * @param {array} aValues - the array to scan
 * @param {string} searchVal - the value to look for
 * @param {number} searchCol - the array column to search
 * @param {number} returnCol - optional, the array column to return if specified, otherwise array index is returned
 * @returns {(number|value)} array index of value found or array value specified by returnCol
 * @customfunction
 */
function ASCAN(aValues, searchVal, searchCol, returnCol) {
var retval = typeof returnCol === "undefined" ? -1 : "";
var i = 0;
var aLen   = aValues.length;
for (i = 0; i < aLen; i++) {
    if (aValues[i][searchCol] == searchVal) {
        retval = typeof returnCol === "undefined" ? i : aValues[i][returnCol];
        break;
    }
}
return retval;
}
Sign up to request clarification or add additional context in comments.

1 Comment

My pleasure, hope this custom function is useful for you and others

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.