I am working on a project where I need to extract three pieces of information from URLs: the environment, the hostname, and the domain. The URLs have a variable number of subdomains, and I'm having difficulty constructing a regex pattern that captures the required groups.
Link: https://regex101.com/r/4DhLns/3
I need help crafting a regex pattern that can efficiently capture the following groups:
- Group 1: environment (e.g., stage, qa)
- Group 2: hostname (e.g., hostname)
- Group 3: domain (e.g., com)
const regex = /.*(?<environment>(qa|stage*)).*\.(?<hostname>\w+)*\.(?<domain>\w+)$/;
function extractInfoFromURL(url) {
  const match = url.match(regex);
  if (match) {
    return match.groups;
  } else {
    return null; // URL didn't match the pattern
  }
}

const testUrls = [
  "https://example.test.qa.sub.hostname.com",
  "https://example.test.stage.coonect.hostname.com",
  "https://example.qa.hostname.com",
  "https://example.hostname.com",
  "https://example.stage.hostname.com",
  "https://ops-cert-stage-beta.apps.sub-test.minor.qa.test.sub.hostname.com",
  "https://ops-cert-qa-beta.apps.sub-test.minor.qa.test.sub.hostname.com",
  "https://ops-cert-qa.apps.sub-test.minor.qa.test.sub.hostname.com",
  "https://ops-cert-stage.apps.sub-test.minor.qa.test.sub.hostname.com"
];

testUrls.forEach((url, index) => {
  const result = extractInfoFromURL(url);
  if (result) {
    console.log(`Result for URL ${index + 1}:`, result);
  } else {
    console.log(`URL ${url} did not match the pattern.`);
  }
});
The issue is with https://example.hostname.com: it contains no environment, so the environment should be null, but the hostname and domain should still be captured.
Regex101: https://regex101.com/r/aCCWRv/2
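
One possible direction, as a minimal sketch rather than a definitive answer: wrap the environment in an optional group so that URLs without one still match. This assumes the URLs have no path, port, or query string, that the environment is always a whole dot-separated label equal to "qa" or "stage", and that hostname labels may contain hyphens. The names `urlPattern` and `parseUrlParts` are made up for illustration.

```javascript
// Sketch: the environment group is optional, so URLs without one still match.
// Assumptions: no path, port, or query string; the environment is a whole
// dot-separated label ("qa" or "stage"), not part of a hyphenated label.
const urlPattern =
  /^https?:\/\/(?:(?:.*\.)?(?<environment>qa|stage)\.)?(?:.*\.)?(?<hostname>[\w-]+)\.(?<domain>\w+)$/;

// parseUrlParts is a hypothetical helper name, not part of the original code.
function parseUrlParts(url) {
  const match = url.match(urlPattern);
  if (!match) {
    return null; // URL didn't match the pattern
  }
  const { environment, hostname, domain } = match.groups;
  // A named group that did not participate in the match is undefined;
  // normalize it to null as the question asks.
  return { environment: environment ?? null, hostname, domain };
}

console.log(parseUrlParts("https://example.hostname.com"));
// { environment: null, hostname: 'hostname', domain: 'com' }
console.log(parseUrlParts("https://example.test.qa.sub.hostname.com"));
// { environment: 'qa', hostname: 'hostname', domain: 'com' }
```

Because the leading `.*` is greedy, this sketch picks the rightmost "qa" or "stage" label when several are present, and it ignores "stage" or "qa" embedded in a hyphenated label such as "ops-cert-stage-beta".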

What should https://ops-cert-stage-beta.apps.sub-test.minor.qa.test.sub.hostname.com give, "test" or "qa"?

Something like this? \bhttps?:\/\/(?<host>\w+(?:-\w+)*).*\.(?<env>qa|stage|dev|preprod|test).*\.(?<domain>\w+)\.[a-z]{2} (regex101.com/r/iOk7rH/1)
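
The "test or qa" ambiguity can also be made explicit by splitting the host into labels instead of relying on regex backtracking. The sketch below is only an illustration under assumptions: `splitUrlParts`, its `environments` list, and its `pick` parameter are hypothetical, the last label is treated as the domain, and the second-to-last as the hostname.

```javascript
// Hypothetical helper: split the host into labels and choose the environment
// by an explicit rule ("first" or "last" matching label).
function splitUrlParts(url, environments = ["qa", "stage"], pick = "last") {
  // URL is a global in browsers and Node.js; hostname excludes scheme and port.
  const labels = new URL(url).hostname.split(".");
  if (labels.length < 2) {
    return null; // need at least "hostname.domain"
  }
  const domain = labels[labels.length - 1];
  const hostname = labels[labels.length - 2];
  // Only whole labels before the hostname count as environment candidates.
  const candidates = labels
    .slice(0, -2)
    .filter((label) => environments.includes(label));
  let environment = null;
  if (candidates.length > 0) {
    environment =
      pick === "first" ? candidates[0] : candidates[candidates.length - 1];
  }
  return { environment, hostname, domain };
}

const sample =
  "https://ops-cert-stage-beta.apps.sub-test.minor.qa.test.sub.hostname.com";
const envs = ["qa", "stage", "dev", "preprod", "test"];
console.log(splitUrlParts(sample, envs, "last").environment);  // test
console.log(splitUrlParts(sample, envs, "first").environment); // qa
```

With the environment list from the suggested pattern, the same URL yields "test" or "qa" depending on which rule is chosen, so that decision has to be settled before any pattern can be called correct.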