I set up a self-hosted Firecrawl instance and I want to crawl my internal intranet site (e.g. https://intranet.xxx.gov.tr/).
I can access the site directly both from the host machine and from inside the container using curl:
# host
curl -v https://intranet.xxx.gov.tr/
# inside container
curl -v https://intranet.xxx.gov.tr/
Both return the page content successfully.
However, when I make a request to the Firecrawl API, I get an error:
curl -X POST http://localhost:3002/v0/crawl \
-H 'Content-Type: application/json' \
-d '{
"url": "https://intranet.xxx.gov.tr/"
}'
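For completeness, the same request can be issued from Python; a minimal sketch (endpoint and payload copied from the curl command above, nothing else assumed):

```python
import json
import urllib.request

def build_crawl_request(api_base: str, target_url: str) -> urllib.request.Request:
    # Build the same POST the curl command sends to the self-hosted
    # Firecrawl API (v0 crawl endpoint).
    payload = json.dumps({"url": target_url}).encode()
    return urllib.request.Request(
        f"{api_base}/v0/crawl",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_crawl_request("http://localhost:3002", "https://intranet.xxx.gov.tr/")
# urllib.request.urlopen(req) would submit the crawl job
```

It fails the same way, so the problem is not specific to curl.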
Firecrawl log output:
Connection violated security rules.
SCRAPE_ALL_ENGINES_FAILED
All scraping engines failed! -- Double check the URL...
The Playwright engine doesn't work either, since its /html endpoint returns 404. The Fetch engine fails with the "Connection violated security rules" error.
My questions:
- Why does Firecrawl block access to intranet domains (like *.intranet.*)?
- How can I bypass this safeFetch security rule, or whitelist my intranet domain?
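My guess is that the "security rules" message comes from an SSRF guard that rejects URLs whose hostname resolves to a private, loopback, or link-local address, which an intranet host typically does. A minimal sketch of that kind of check (illustrative only; the function name is mine, this is not Firecrawl's actual code):

```python
import ipaddress
import socket

def is_private_host(hostname: str) -> bool:
    # Resolve the hostname and flag it if any resulting address is
    # private, loopback, or link-local -- the typical test an SSRF
    # guard performs before allowing an outbound fetch.
    for info in socket.getaddrinfo(hostname, None):
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return True
    return False

print(is_private_host("localhost"))  # True
```

If that is what is happening, the fix I am looking for is a way to exempt specific hostnames from that check on a self-hosted instance.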
Notes:
- Firecrawl is running self-hosted in Docker.
- The intranet domain is reachable from both the host and from inside the container.