I'm automatically looking for various properties of README.md files.

import requests
from typing import Final

FILE_NOT_FOUND: Final[int] = 404
GITHUB_URL: Final[str] = "https://github.com"  # no trailing slash: the f-string below adds it

repo_name_1 = "FasterXML/jackson-databind" # <--- main_name = 2.14
repo_name_2 = "Cacti/cacti"                # <--- main_name = develop
repo_name_3 = "FFmpeg/FFmpeg"              # <--- main_name = master

main_name = "develop"
repo_name = repo_name_2

url = f"{GITHUB_URL}/{repo_name}/blob/{main_name}/README.md"
if requests.head(url).status_code != FILE_NOT_FOUND:
    print(f"README.md of {repo_name} exists")
    # do something with README.md ...

I need to somehow put a regex instead of main_name, is that doable?

EDIT:

Put another way, can one use requests to do the equivalent of find:

find https://github.com/FasterXML/jackson-databind -name "README.md"
  • What do you mean by put a regex here? You are constructing a string here, not extracting parts of it. What would be this regex pattern and what it would be matched against? Commented May 9, 2022 at 14:50
  • I guess any sequence without / would be good for me - is it possible to do that? Commented May 9, 2022 at 14:54
  • There are infinitely many such sequences. Commented May 9, 2022 at 14:55
  • When you call requests.head() you have to tell it which URL to request. You can't tell it to request any URL that matches a particular pattern. It would be like telling your web browser that you want to go to stackover[a-zA-Z0-9]low.com. What would it do? Visit 62 different websites and bring them all back in different tabs? What would that even mean? If you need to check whether N variations of a particular URL exist, you will have to request N URLs. There's no getting around that. Commented May 9, 2022 at 14:56
  • Furthermore, this is an odd, backwards use case for regex. A regex is a pattern that defines a set of strings, but it doesn't generate that set; it is used to test an already defined string to see whether it matches the pattern. In your case you want to generate a set of strings from the pattern. Clearly folks have done this, but it's odd and feels like an anti-pattern, as many regex patterns would yield an infinite set. Commented May 9, 2022 at 15:02

1 Answer

Setting aside the XY problem (a regex is not relevant here): what you are looking for is the GitHub API, which provides, among other things, exactly this information. If you look at the response to this request: https://api.github.com/repos/FasterXML/jackson-databind you can see that the returned JSON contains a "default_branch" field holding exactly what you need.
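
As a rough sketch of how that lookup could be wired into the code from the question: the helper names below (fetch_json, default_branch, readme_url) are made up for illustration and are not part of any library. It uses the standard library's urllib so that it has no dependencies; requests.get(url).json() would do the same job. It assumes unauthenticated API access, which GitHub rate-limits.

```python
import json
import urllib.request
from typing import Callable

GITHUB_API_URL = "https://api.github.com"
GITHUB_URL = "https://github.com"


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode its JSON body (performs a real network request)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def default_branch(repo_name: str, fetch: Callable[[str], dict] = fetch_json) -> str:
    """Return the "default_branch" field the GitHub API reports for repo_name."""
    return fetch(f"{GITHUB_API_URL}/repos/{repo_name}")["default_branch"]


def readme_url(repo_name: str, fetch: Callable[[str], dict] = fetch_json) -> str:
    """Build the README.md URL on the repository's default branch."""
    branch = default_branch(repo_name, fetch)
    return f"{GITHUB_URL}/{repo_name}/blob/{branch}/README.md"


# Hypothetical usage (needs network access):
#   url = readme_url("Cacti/cacti")
#   then check it with requests.head(url) as in the question
```

The fetch parameter is only there so the lookup can be exercised without hitting the network; in normal use the default is enough. The resulting URL still has to be checked with a HEAD request, since a repository may have no README.md on its default branch.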
