
So, I have a Node.js Lambda that saves its response in an AWS ElastiCache Valkey cache. All the keys follow the same format: getActivities:*. I'd like to clear all the keys matching this pattern.

I tried to clear the cache using Node.js, but I encountered an error: CROSSSLOT Keys in request don't hash to the same slot.

I don't think running this on Node.js is a good idea. How can I clear my cache using AWS?

2 Answers


The asker's own answer (below) will work, but it is a big no-no.

I will refer to Valkey in this answer, but the same is true for Redis, and Glide can be used with the Redis OSS versions as well (maybe also with the non-OSS versions, but we don't follow those codebase changes).

On the hashtag idea:
When you hashtag, your keys will all be routed to one specific shard.
If that's a small amount of data relative to your cluster's node size, that's fine, and in some cases it is even the best practice.

However, this is not recommended for general usage.
The idea of clustering is to distribute your keys and network load between different shards. Sharding works based on a very solid hash algorithm; if you manually funnel keys into one specific shard, you are creating an unbalanced cluster.
Hashtags are OK to use, but they should be used carefully, when no better option exists.
In this case, there are plenty of better and healthier ways to do it.
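To make the slot mechanics concrete, here is a small self-contained TypeScript sketch of the slot calculation clusters use (CRC16-XMODEM of the key, modulo 16384, with the {...} hashtag extraction rule from the cluster spec). It is an illustration of the spec, not code from any client library:

```typescript
// Compute the cluster slot for a key: CRC16-XMODEM of the "hashable" part
// of the key, modulo 16384 slots.
function crc16(data: string): number {
  let crc = 0;
  for (let i = 0; i < data.length; i++) {
    crc ^= data.charCodeAt(i) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// If the key contains a non-empty {...} section, only that part is hashed --
// this is exactly why hashtagged keys all land on the same shard.
function keySlot(key: string): number {
  const open = key.indexOf("{");
  if (open !== -1) {
    const close = key.indexOf("}", open + 1);
    if (close !== -1 && close > open + 1) {
      key = key.substring(open + 1, close);
    }
  }
  return crc16(key) % 16384;
}

// Without hashtags, keys spread across slots (and therefore shards):
console.log(keySlot("getActivities:1"), keySlot("getActivities:2"));
// With hashtags, every key lands on the same slot -- one shard gets them all:
console.log(keySlot("{getActivities}:1") === keySlot("{getActivities}:2")); // true
```

This is also why the CROSSSLOT error appears: a multi-key DEL is only allowed when all keys hash to the same slot.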

For the KEYS usage:
The command documentation: keys warning

Using KEYS will block the node until the command has walked every key in the node.
It hurts performance and creates unresponsiveness, potentially leading to a connection storm, which in turn causes extremely high network load that prevents the server from becoming responsive again.

While ElastiCache has ways to deal with such mistakes and misbehavior, at the end of the day it cannot force users to behave in a certain way, and if you decide to crash it, at some point you'll exceed what the system can defend against.

There are also client libraries with fault-tolerant mechanisms that defend your system against connection storms and deal better with crashes or blocked nodes, and it's recommended to use them.
On that point, valkey-glide proves to be the best for fault tolerance, since it was designed based on years of customers' pains and issues; see valkey-glide. But I'm a Glide maintainer, so don't take my word for it: take it for a ride, trigger some manual failovers, and see if I'm being real.
But even with the best client (valkey-glide) and ElastiCache's defense mechanisms, machines have limitations; if you deliberately cross the machine's limits, it will crash.

In some cases, KEYS by itself may crash your node by maxing out its CPU. It is not a production command!

So what should you do instead:

Method 1 - using client library:
Use a client library with cluster connection support.
Again, I recommend valkey-glide.
The explanation below uses valkey-glide, but you can implement a similar mechanism using other clients as well. The code sample will be TS; at the time of writing, Glide supports Python, Java, and Node.js, Go is going to public preview in about two weeks and is planned for GA in two months, Ruby and C++ are under development, and C# is planned for around Aug '25. But don't hold me to the exact dates.

Glide has a Cluster Scan feature, which gives you the guarantees of SCAN.
At the time of writing, an out-of-the-box Cluster Scan is under development by Valkey itself, and as far as I know it is not implemented by Redis.
Other client libraries also offer a cluster scan feature, but based on my research, I couldn't find one with a mechanism that guarantees you get all the keys (I implemented Cluster Scan in Glide and did a vast search for existing good solutions, but maybe one exists).

Using Cluster Scan lets you iterate over the keys in your cluster, matching a pattern as well. You can further narrow the scan by limiting it to a specific data type if you are looking for one.

For each iteration of the scan, you just pass the results to DEL, until you finish the iteration. You can pass a higher COUNT parameter if you want to finish faster, but it is very fast anyway, a matter of single- or double-digit milliseconds.

You don't need a hashtag: by combining the pattern with Cluster Scan, the client iterates over all the shards in the cluster by itself.
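The control flow itself is nothing Glide-specific: scan one page of matching keys, delete that page, repeat until the cursor reports it's done. Here is that loop sketched in TypeScript against a hypothetical in-memory stand-in (FakeClient is invented purely for illustration; the real thing is GlideClusterClient in the code sample below):

```typescript
// Hypothetical in-memory stand-in for a cluster client, just to show the
// shape of the scan-and-delete loop. Not a real client.
class FakeClient {
  private store = new Set<string>([
    "getActivities:1", "getActivities:2", "getActivities:3", "other:1",
  ]);

  // Returns up to `count` keys matching the prefix pattern, plus a cursor:
  // 0 means the scan is finished, non-zero means call again (a loose
  // imitation of SCAN semantics -- valid here because we delete each page).
  scan(cursor: number, match: string, count: number): [number, string[]] {
    const prefix = match.endsWith("*") ? match.slice(0, -1) : match;
    const matching = [...this.store].filter((k) => k.startsWith(prefix));
    const page = matching.slice(0, count);
    return [matching.length > count ? 1 : 0, page];
  }

  del(keys: string[]): void {
    for (const k of keys) this.store.delete(k);
  }

  size(): number {
    return this.store.size;
  }
}

const fake = new FakeClient();
let cur = -1; // -1 = not started yet; 0 = finished
while (cur !== 0) {
  const [next, keys] = fake.scan(Math.max(cur, 0), "getActivities:*", 2);
  if (keys.length > 0) fake.del(keys); // delete this page before scanning on
  cur = next;
}
console.log(fake.size()); // 1 -- only "other:1" survives
```

In the real loop, cursor.isFinished() plays the role of the cur !== 0 check; the real Cluster Scan guarantees matter precisely because keys live on different nodes and can move mid-scan.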

A note specifically for Lambda, or any read-only file system. If that's not your case, just skip to the code sample.
If you are using this with Lambda, whose file system is read-only, you need to turn off the logger first, since by default Glide sets up a logger at WARN level that logs to a file.
You can change it to log to stdout instead, and a planned feature is integration with CloudWatch and similar services. For read-only systems that let you pass environment variables, if you still want a log file, you can set the XDG_RUNTIME_DIR environment variable to /dev/shm.

The code sample:

import { GlideClusterClient, ClusterScanCursor, GlideString, Logger } from "@valkey/valkey-glide";

Logger.init("off"); // Cancel file logging on read-only file systems

// Creating the cluster client, you can pass one or more node addresses. Anyway Glide will populate the cluster and discover them.
const client = await GlideClusterClient.createClient({
  addresses: [
    {
      host: "myclustername.xxxxxx.cfg.usw2.cache.amazonaws.com", // In case of AWS ElastiCache, use the configuration endpoint of the cluster
      port: 6379 // Replace with the port of the cluster
    }
  ],
  // useTLS: true if using TLS
  // requestTimeout: Number, timeout in milliseconds - recommended to set a value that fits the use case
});

let cursor = new ClusterScanCursor();

let keys: GlideString[] = [];

// The scan iterates over the whole cluster, meaning you don't need a hashtag pinning keys to one specific shard; just provide a pattern to the match parameter.
while (!cursor.isFinished()) {
  [cursor, keys] = await client.scan(cursor, {
    match: "getActivities:*", count: 10
  });
  if (keys.length > 0) {
    await client.del(keys); // Delete the keys returned by this scan page
  }
}

client.close(); // Release the client's resources once the scan completes

Method 2 - using valkey-cli or ElastiCache's "Connect to cache":
The best solution I was able to find is to connect with valkey-cli in cluster mode. With ElastiCache's "Connect to cache" this is done automatically; with a local valkey-cli, add -c to the connection command.
Short explanation of what happens:
CLUSTER NODES retrieves all the node IDs.
GET {master-id} will move you to the master you want. Do this master after master, so you clean the whole cluster.
The EVAL is a "Valkey script" that retrieves results from the node using SCAN and deletes the returned keys (I wrote this one; you can write your own if you want, for any reason).
The EVAL script could be written to loop over all the keys without you iterating, like any script, but doing so would block the server just like KEYS. So work a bit harder and iterate manually.
On each iteration, replace the cursor with the returned one; when it returns zero again, repeat the process on the next master.

# Run cluster node to get all ids
CLUSTER NODES
3eff73b9e0e46d8c7970d9dc5e9fa79e1173b02d clustername-0003-002.clustername.***.use2.cache.amazonaws.com:6379@1122 slave a2c285fec7283b56d54f13e11195dcab8ebb16c3 0 1737819161000 0 connected
1953e97956ea213c3d2782c6a438d961429ac994 clustername-0002-001.clustername.***.use2.cache.amazonaws.com:6379@1122 master - 0 1737819160000 3 connected 5462-10922
dd003f7bed47b73c4eb1375842160d4d7c39c766 clustername-0001-002.clustername.***.use2.cache.amazonaws.com:6379@1122 slave 5159451a2251b9b0fc61554a7a2ccae36969d9e8 0 1737819160783 1 connected
9658871c7c6409d0b411cce702e316410f97eaab clustername-0001-003.clustername.***.use2.cache.amazonaws.com:6379@1122 slave 5159451a2251b9b0fc61554a7a2ccae36969d9e8 0 1737819161786 1 connected
76a775f6cd1653d71ca25c4cb84190eb32921455 clustername-0002-003.clustername.***.use2.cache.amazonaws.com:6379@1122 slave 1953e97956ea213c3d2782c6a438d961429ac994 0 1737819160000 3 connected
a2c285fec7283b56d54f13e11195dcab8ebb16c3 clustername-0003-001.clustername.***.use2.cache.amazonaws.com:6379@1122 myself,master - 0 0 0 connected 10923-16383
5159451a2251b9b0fc61554a7a2ccae36969d9e8 clustername-0001-001.clustername.***.use2.cache.amazonaws.com:6379@1122 master - 0 1737819162789 1 connected 0-5461
02dce0129f2cc72823301071ebe2c3189a80fe8f clustername-0003-003.clustername.***.use2.cache.amazonaws.com:6379@1122 slave a2c285fec7283b56d54f13e11195dcab8ebb16c3 0 1737819159000 0 connected
3678acf328006bfbaf95285c86e81d592f78bdfc clustername-0002-002.clustername.***.use2.cache.amazonaws.com:6379@1122 slave 1953e97956ea213c3d2782c6a438d961429ac994 0 1737819161000 3 connected

# For each of the masters, use the master id to switch the connection and do the below
GET 1953e97956ea213c3d2782c6a438d961429ac994 # The id of one of the masters returned by CLUSTER NODES; this switches the connection to that specific master
EVAL "local cursor = tonumber(ARGV[1]); local pattern = ARGV[2]; local count = tonumber(ARGV[3]); local keys = redis.call('SCAN', cursor, 'MATCH', pattern, 'COUNT', count); cursor = tonumber(keys[1]); for i, key in ipairs(keys[2]) do redis.call('DEL', key) end; return cursor;" 0 0 "pattern" 100 
# Change the 0 0 "pattern" 100 arguments at the end: the second 0 changes on each iteration to the returned cursor, "pattern" to the pattern you want, and 100 to the count you want.
(integer) 13

Repeat until you get (integer) 0, then move to the next master.
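If you'd rather not eyeball the CLUSTER NODES output for the master lines, the master IDs can be extracted mechanically. A quick TypeScript sketch, assuming the standard CLUSTER NODES line format (id, address, flags, master-id, ...); the hostnames in the sample are made up:

```typescript
// Parse CLUSTER NODES output and return the node IDs of the masters.
// Fields per line: <id> <ip:port@cport> <flags> <master-id> <ping> <pong> <epoch> <state> <slots...>
function masterIds(clusterNodesOutput: string): string[] {
  return clusterNodesOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    // The flags field is comma-separated, e.g. "myself,master" or "slave".
    .filter((line) => line.split(/\s+/)[2].split(",").includes("master"))
    .map((line) => line.split(/\s+/)[0]);
}

const sample = `
1953e97956ea213c3d2782c6a438d961429ac994 host-b:6379@1122 master - 0 1737819160000 3 connected 5462-10922
dd003f7bed47b73c4eb1375842160d4d7c39c766 host-c:6379@1122 slave 5159451a2251b9b0fc61554a7a2ccae36969d9e8 0 1737819160783 1 connected
a2c285fec7283b56d54f13e11195dcab8ebb16c3 host-a:6379@1122 myself,master - 0 0 0 connected 10923-16383
`;

console.log(masterIds(sample)); // prints the two IDs of the lines flagged "master"
```

You then walk that list, running the GET/EVAL loop above once per master.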


6 Comments

Thank you! About my answer: my cache isn't (yet) in a production environment. So I certainly didn't encounter a max CPU usage. But I see your point. I'm still a beginner, and I saw people using hashtags in multiple posts. I appreciate you took the time to write a better answer. But something is still unclear: if I use your Scan method, I don't need to use hashtags at all? That's why you didn't provide an alternative? They are simply not needed if I'm using the correct method to find and delete keys? Thanks in advance for clarifying that in your answer, I'll gladly accept it. 👍
Yes, you absolutely don't need the hashtags. My suggestion, preferably the code version, will scan all the nodes in the cluster by itself; in the second one, as you see in the comments inside the code, you manually move between the nodes (masters only) and delete the keys. I provided an EVAL so you don't need to catch and delete yourself; the EVAL does it by itself. Meaning, I created a "Valkey script" that deletes based on the returned keys. I will edit and clarify it. As a Glide maintainer and ElastiCache team member, you can reach out to me for any help needed. I'll be more than happy to help.
Edited the answer to make it more clear. For contacting for specific help needed, you can open an issue in Glide repo, or contact me using my personal info you can get from my github profile avifenesh.
If possible, leave your answer so its clear to what I'm referring in my answer, but add a note that it shouldn't be used in production.
Thank you for being so helpful. I'll try your script next time I have to delete some keys. In the meantime, I'm just glad you saved me from implementing an anti-pattern 👍

⚠️ Wrong answer

I'm leaving this answer up for context. But please look at @avifen's answer instead. ⬆

Use Hashtags

You (I) should use hashtags in your keys so that you can easily perform delete operations on multiple keys. Hashtags will make them "related", which makes a multi-key DEL operation possible on clustered Redis.

Basically what you need to do is to put the part of your key that is common to all related keys between curly braces. That will create the hashtags. Deleting these keys using pattern matching should work now.

Example: {getActivities}:restofthekey*

For more information, read this.

How to delete keys in AWS ElastiCache

  1. Log into AWS Elasticache console
  2. Select your valkey cache
  3. Scroll down and select the Connectivity and Security tab
  4. You should see a Connect to your Cache option. This will open up a terminal preconfigured to be connected to the valkey-cli of your cache.


  1. If your request use hashtags, all you need to do is to run the following command: DEL KEYS "{YOUR_KEY_MATCHING_PATTERN}*". That will remove the keys, congrats you can stop here.
  2. If (like me) you didn't know about hashtags, and still have keys to remove, this command will likely trigger an error (CROSSSLOT Keys in request don't hash to the same slot). If it does, start by listing the keys:
clustercfg.****-cache.***.****.cache.amazonaws.com:****> KEYS "activities:*"
 1) "activities:3006787:bicycle_rental,bicycle_parking,restaurant"
 2) "activities:2995469:bicycle_rental,bicycle_parking"
 3) "activities:2972315:bicycle_rental,bicycle_parking,restaurant"
 4) "activities:2995469:bicycle_rental,bicycle_parking,restaurant"
 5) "activities:2996944:bicycle_rental,bicycle_parking,restaurant"
 6) "activities:3037543:bicycle_rental,bicycle_parking,restaurant"
 7) "activities:2998324:bicycle_rental,bicycle_parking"
 8) "activities:2988507:bicycle_rental,bicycle_parking,restaurant"
 9) "activities:3031582:bicycle_rental,bicycle_parking,restaurant"
10) "activities:2972315:bicycle_rental,bicycle_parking"
  1. Copy these keys, open up VSCode
  2. Put the cursor after 1), press Option+Shift, and then scroll down until you've created cursors on every line (you might need to repeat the process if you have more than 9, 99, 999 lines). Delete the numbers, remove the spaces.
  3. Add a DEL in front of every key.
  4. Your lines should look like this: DEL "activities:2998324:bicycle_parking,bicycle_rental"
  5. Copy everything, paste it right to the terminal, boom, all your keys will be deleted.
  6. Run KEYS "YOUR_MATCHING_PATTERN*" to ensure everything was correctly removed.

1 Comment

Your answer is very problematic; please remove it, so readers will not use it.
