We store our logs in a Blob container, and we create an individual JSON file for each action,

for example 12345.json

{"User":"User1","Location":"LOC","timestamp":"2023-01-10T10:34:43.5470187+00:00","Id":"12345"}

I want to return all the records where User = User1.

I can use BlobServiceClient to connect to the Blob storage account and retrieve all the JSON files. I assume I can then read each JSON file and filter on its contents, but is there a better way to do this?

My ultimate goal is to create an endpoint that accepts a list of keywords and a date range, and returns the matching results.

  • Maybe naming the file with the unique identifier or the user and filtering the existing files by that prefix? Example: "userABC_DayMonthYearTime_LogLevel.json". So you could filter by anything that starts with "userABC%" Commented Jan 11, 2023 at 17:20
  • @mrbitzilla Thanks, yes but my ultimate goal is to create an endpoint and accept list of keywords and date range. I will add it to the post to make it more clear. Commented Jan 11, 2023 at 17:28
  • Remember that blob storage isn't a database engine. Might want to consider using a proper database to store all your searchable metadata, with reference links to the full content in blobs. Commented Jan 11, 2023 at 17:58
  • Sounds like a job for Cosmos DB. Commented Jan 11, 2023 at 18:01
  • @DavidMakogon: That's true. Cosmos DB stores JSON documents regardless of their structure, and you can query any property out of the box. IMHO it sounds like the best-matching candidate. If you use a SQL server, you have to define a schema and convert the data for the best query performance, and other NoSQL databases that aren't based on JSON also need some kind of conversion. Additionally, the user seems to be on Azure Blob Storage already, so picking Azure Cosmos DB should be easier than setting up Mongo, Couch, ElasticSearch, etc. Commented Jan 12, 2023 at 9:34

1 Answer

If you want to use Blob Storage only, then the option would be to first list all the blobs in the container and then search inside each blob using the Query Blob Contents operation (this is a REST API operation; please check for the equivalent method in your SDK).

The other option (a much better one, IMO) would be to use Azure Cognitive Search and create a Blob Indexer. Have the contents of the blob container indexed by Azure Cognitive Search, and then run your searches over that indexed data.
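To make the Blob-Storage-only route more concrete, here is a minimal Python sketch of the filtering step on its own, assuming the JSON text of each blob has already been downloaded. The names `parse_timestamp`, `matches` and `filter_logs` are hypothetical helpers written for this illustration, not SDK methods, and the field names are taken from the sample file in the question. The filtering runs client-side, so every blob still has to be read.

```python
import json
import re
from datetime import datetime, timezone


def parse_timestamp(value):
    # The sample timestamp has 7 fractional digits; trim to 6 so that
    # datetime.fromisoformat also accepts it on older Python versions.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", value)
    return datetime.fromisoformat(trimmed)


def matches(record, user=None, keywords=(), start=None, end=None):
    """Return True if one log record satisfies every supplied criterion."""
    if user is not None and record.get("User") != user:
        return False
    ts = parse_timestamp(record["timestamp"])
    if start is not None and ts < start:
        return False
    if end is not None and ts > end:
        return False
    # Keyword match: case-insensitive substring search over the field values.
    haystack = " ".join(str(v) for v in record.values()).lower()
    return all(k.lower() in haystack for k in keywords)


def filter_logs(raw_blobs, **criteria):
    """raw_blobs: iterable of JSON strings, one per downloaded blob."""
    records = (json.loads(raw) for raw in raw_blobs)
    return [r for r in records if matches(r, **criteria)]


if __name__ == "__main__":
    sample = [
        '{"User":"User1","Location":"LOC",'
        '"timestamp":"2023-01-10T10:34:43.5470187+00:00","Id":"12345"}',
    ]
    print(filter_logs(sample, user="User1"))
```

In a real service the strings would come from the storage SDK (for example, listing the container and downloading each blob), which is exactly the cost that makes this approach slow for large containers.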

You can learn more about using Azure Cognitive Search with Blob Storage here: https://learn.microsoft.com/en-us/azure/search/search-blob-storage-integration. For working with JSON data in Blob Storage, please see this: https://learn.microsoft.com/en-us/azure/search/search-howto-index-json-blobs.
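Once the blobs are indexed, the user and date-range parts of the endpoint could map onto an OData `$filter` expression. The sketch below only builds that filter string. It assumes the index exposes filterable fields named `User` and `timestamp` (taken from the sample JSON; your index schema may differ), and `build_filter` is a hypothetical helper, not part of any SDK. Keywords would normally go in the search text of the query rather than in the filter.

```python
from datetime import datetime, timezone


def _quote(value):
    # OData string literals use single quotes; embedded quotes are doubled.
    return "'" + value.replace("'", "''") + "'"


def _utc(dt):
    # Expects a timezone-aware datetime; renders it as a UTC literal.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_filter(user=None, start=None, end=None):
    """Combine the supplied criteria into a single OData filter string."""
    clauses = []
    if user is not None:
        clauses.append(f"User eq {_quote(user)}")
    if start is not None:
        clauses.append(f"timestamp ge {_utc(start)}")
    if end is not None:
        clauses.append(f"timestamp le {_utc(end)}")
    return " and ".join(clauses)


if __name__ == "__main__":
    print(build_filter(
        user="User1",
        start=datetime(2023, 1, 1, tzinfo=timezone.utc),
        end=datetime(2023, 1, 31, 23, 59, 59, tzinfo=timezone.utc),
    ))
```

The resulting string would be passed as the filter of a search request, with the keywords supplied as the search text.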

2 Comments

Thanks! Is a Blob Indexer the same as blob index tags?
No. It is an entirely different thing. Indexers are part of Azure Cognitive Search service.
