The Web Crawler integration allows you to extract information from websites. While the integration offers both indexing and live search, we highly recommend indexing a website before you query it. Larger websites can take several minutes to index, but live search is limited to the front page.
The maximum depth of the website to crawl. 0 means only the root page will be crawled, 1 means the root page and all pages linked from it will be crawled, and so on.
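To make the depth semantics concrete, here is a minimal sketch of a depth-limited breadth-first traversal over an in-memory link graph. It is not the integration's actual crawler: the function name `pages_within_depth`, the `site` dictionary, and its paths are hypothetical, chosen only to illustrate how a depth of 0, 1, or 2 changes the set of pages reached.

```python
from collections import deque


def pages_within_depth(links, root, max_depth):
    """Return pages reachable from `root` in at most `max_depth` link hops.

    `links` maps a page to the list of pages it links to (hypothetical data,
    standing in for the links a real crawler would discover).
    """
    seen = {root}
    order = [root]
    queue = deque([(root, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            # Pages at the maximum depth are included, but their links are not followed.
            continue
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order


# Hypothetical site structure used only for illustration.
site = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/api"],
    "/blog": ["/"],
    "/docs/api": [],
}

print(pages_within_depth(site, "/", 0))  # root page only
print(pages_within_depth(site, "/", 1))  # root page plus pages it links to
```

Under these assumptions, raising the depth by one adds every page linked from the pages already reached, which is why the number of crawled pages can grow quickly on large sites.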
A summary of the document that can be fed directly into LLMs. When retrieved from the /query endpoint, this may summarize only the sections of the document returned as highlights (i.e. the parts relevant to your query). Otherwise, it will summarize the entire document.
A structured representation of the website’s content. This field is only populated when the resource is retrieved from the /documents/get endpoint. When the resource is retrieved from the /query endpoint, this field will be empty and the highlights field will contain the relevant sections of the website.
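Because the structured content is empty on /query results, client code typically needs a fallback to the highlights. The sketch below shows one way to handle that. It assumes the resource has already been parsed into a Python dictionary, and the key name `content` and the shape of both values are assumptions made for illustration; only the `highlights` field name comes from the description above.

```python
def relevant_sections(resource):
    """Pick the field that actually carries the page text.

    Returns a (field_name, value) pair. The "content" key is an assumed name
    for the structured representation; "highlights" is assumed to be a list.
    """
    content = resource.get("content")
    if content:
        # Populated only when the resource came from /documents/get.
        return "content", content
    # Resources from /query leave the structured content empty.
    return "highlights", resource.get("highlights", [])


# Hypothetical resources, shaped only for this example.
from_documents_get = {"content": {"title": "Example"}, "highlights": []}
from_query = {"content": None, "highlights": ["first relevant section"]}

print(relevant_sections(from_documents_get)[0])
print(relevant_sections(from_query)[0])
```

Checking for an empty value rather than a missing key keeps the helper working whether the empty field is omitted, null, or an empty container.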