Start Website Crawl
Initiate a crawl to explore the URLs of a website
Starts an asynchronous crawl operation that discovers the URLs located under the provided root URL.
You can check the status of the created crawl operation using the Get Crawl Status endpoint, or stop it using the Stop Crawl endpoint.
Crawling does not index the discovered URLs. To index them, you need to add the discovered URLs as a data source manually by passing them to the Create Data Source endpoint.
Crawls are limited to one concurrent operation per Guru type. A request to start a new crawl fails while another crawl is already running for the same Guru type.
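Because a second start request fails while a crawl is running, a client may want to track in-flight crawls itself and avoid sending a request that is bound to fail. The following is a minimal client-side sketch of that rule only. The class name `CrawlSlots` and the slugs are hypothetical, and nothing here calls the API or reflects how the server enforces the limit.

```python
class CrawlSlots:
    """Tracks which Guru types have a crawl in progress (client side only)."""

    def __init__(self):
        # Hypothetical bookkeeping: slugs of Guru types with a running crawl.
        self._running = set()

    def try_start(self, guru_slug):
        """Reserve the slot for guru_slug; return False if one is already tracked."""
        if guru_slug in self._running:
            return False
        self._running.add(guru_slug)
        return True

    def finish(self, guru_slug):
        """Release the slot once the crawl has completed or been stopped."""
        self._running.discard(guru_slug)


slots = CrawlSlots()
first = slots.try_start("my-guru")      # slot is free, so it is reserved
second = slots.try_start("my-guru")     # refused: a crawl is already tracked
other = slots.try_start("other-guru")   # a different Guru type has its own slot
slots.finish("my-guru")
again = slots.try_start("my-guru")      # allowed again after release
```

A tracker like this only reflects crawls started by the same client, so the response of the start request remains the authoritative signal.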
Path Parameters
The slug of the Guru type to associate the crawled content with
Body Parameters
The root URL to start crawling from. Must include the http:// or https:// protocol.
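The protocol requirement can be checked before a request is sent. This is a minimal sketch using Python's standard `urllib.parse`. The function name `is_valid_root_url` is hypothetical, and the server may apply additional validation that is not described on this page.

```python
from urllib.parse import urlparse


def is_valid_root_url(url):
    """Return True if url has an http or https scheme and a non-empty host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


ok = is_valid_root_url("https://example.com/a/b/c")   # has https:// and a host
missing = is_valid_root_url("example.com/a/b/c")      # no protocol
wrong = is_valid_root_url("ftp://example.com/a")      # protocol other than http(s)
```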
Response
Unique identifier for the crawl operation
The root URL to be crawled. The crawler will start from this URL and follow all links that begin with it. For example, if the URL is https://example.com/a/b/c, the crawler will extract all links that start with https://example.com/a/b/c.
Current status of the crawl operation
The Guru type that the crawl was initiated for
List of URLs discovered during crawling
Timestamp when the crawl started (ISO 8601 format)
Timestamp when the crawl ended (ISO 8601 format)
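The root URL description above says the crawler follows links that begin with the root URL. The sketch below models that rule as a plain string-prefix check. The function name `filter_in_scope` and the sample links are hypothetical, and the actual crawler may normalize URLs or treat path boundaries differently.

```python
def filter_in_scope(root_url, links):
    """Keep only the links that start with root_url (plain string prefix)."""
    return [link for link in links if link.startswith(root_url)]


root = "https://example.com/a/b/c"
links = [
    "https://example.com/a/b/c/d",    # under the root URL
    "https://example.com/a/b/c?x=1",  # same path with a query string
    "https://example.com/a/b",        # parent path, outside the prefix
    "https://example.com/other",      # unrelated path
]
in_scope = filter_in_scope(root, links)
# in_scope == ["https://example.com/a/b/c/d", "https://example.com/a/b/c?x=1"]
```

Under this reading, choosing a more specific root URL narrows the crawl, while a site's top-level URL covers every link on that host that shares the prefix.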