Get URL Data

Using the correct query

Lumar offers two ways to retrieve raw URL, link, sitemap, and similar data:

The query described on this page retrieves defined metrics for URLs in a crawl. It can be filtered and sorted, but you must paginate through URLs 100 at a time. That makes it well suited to sampling the available data and poorly suited to retrieving all data for a crawl.
The Download Raw Data query downloads all data from a datasource in a single request, but the results cannot be filtered or sorted. It is the most efficient way to access all data.
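
Paging through results 100 at a time can be sketched as a loop over cursors. This is a minimal sketch, not Lumar's client. It assumes the `crawlUrls` connection follows the common Relay pattern (an `after` argument plus `pageInfo { hasNextPage endCursor }`), which this page does not confirm. `fetch_page` is a hypothetical callable you supply that runs the query for one page and returns the `crawlUrls` object.

```python
from typing import Any, Callable, Dict, Iterator, Optional

PAGE_SIZE = 100  # page size stated on this page

# Hypothetical callable: takes (first, after_cursor) and returns the
# `crawlUrls` object for that page.
FetchPage = Callable[[int, Optional[str]], Dict[str, Any]]


def iter_crawl_urls(
    fetch_page: FetchPage, limit: Optional[int] = None
) -> Iterator[Dict[str, Any]]:
    """Yield URL nodes page by page, stopping early once `limit` is reached."""
    cursor: Optional[str] = None
    yielded = 0
    while limit is None or yielded < limit:
        page = fetch_page(PAGE_SIZE, cursor)
        for node in page.get("nodes", []):
            yield node
            yielded += 1
            if limit is not None and yielded >= limit:
                return
        # Assumed Relay-style fields; adjust if the schema differs.
        info = page.get("pageInfo") or {}
        if not info.get("hasNextPage"):
            return
        cursor = info.get("endCursor")
```

Passing a `limit` keeps the loop suitable for sampling. For a full export, the Download Raw Data query remains the better fit.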

Using the `getReportStats` query to access Crawl URL data

The sample query below returns five properties (fetchTime, pageTitle, responsive, url, wordCount) for each crawled URL, but hundreds more are available. For the complete list, inspect the CrawlUrl type.
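
Because the selection set is a plain list of field names, the query text can be generated from whichever `CrawlUrl` properties you need. The following is a minimal sketch that mirrors the shape of the sample query on this page. `build_url_query` is a hypothetical helper, not part of any Lumar client, and it assumes the names you pass exist on `CrawlUrl`.

```python
import re
from typing import Iterable

# GraphQL names: a letter or underscore, then letters, digits, or underscores.
_NAME = re.compile(r"^[_A-Za-z][_0-9A-Za-z]*$")


def build_url_query(fields: Iterable[str], first: int = 3) -> str:
    """Return query text selecting `fields` on each crawled URL node."""
    names = list(fields)
    if not names:
        raise ValueError("at least one field is required")
    for name in names:
        if not _NAME.match(name):
            raise ValueError(f"not a valid GraphQL field name: {name!r}")
    selection = "\n".join(f"        {name}" for name in names)
    return (
        "query GetUrlData($crawlId: ObjectID!) {\n"
        '  getReportStat(input: { crawlId: $crawlId, reportTemplateCode: "all_pages" }) {\n'
        f"    crawlUrls(reportType: Basic, first: {first}) {{\n"
        "      nodes {\n"
        f"{selection}\n"
        "      }\n"
        "      totalCount\n"
        "    }\n"
        "  }\n"
        "}"
    )
```

Validating names before interpolating them avoids sending a malformed query when the field list comes from user input or configuration.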

The query, a sample response, and the equivalent cURL request are shown below, in that order.

query GetUrlData($crawlId: ObjectID!) {
  getReportStat(input: { crawlId: $crawlId, reportTemplateCode: "all_pages" }) {
    crawlUrls(reportType: Basic, first: 3) {
      nodes {
        fetchTime
        pageTitle
        responsive
        url
        wordCount
      }
      totalCount
    }
  }
}

{
  "data": {
    "getReportStats": [
      {
        "crawlUrls": {
          "nodes": [
            {
              "fetchTime": 0.38,
              "pageTitle": "FAQ - Lumar",
              "responsive": true,
              "url": "https://www.lumar.io/faq/",
              "wordCount": 4055
            },
            {
              "fetchTime": 0.03,
              "pageTitle": "About - Lumar",
              "responsive": true,
              "url": "https://www.lumar.io/about",
              "wordCount": 1074
            },
            {
              "fetchTime": 0.04,
              "pageTitle": "Blog - Lumar",
              "responsive": true,
              "url": "https://www.lumar.io/blog/",
              "wordCount": 605
            }
          ],
          "totalCount": 2186
        }
      }
    ]
  }
}

curl -X POST \
  -H "Content-Type: application/json" \
  -H "apollographql-client-name: docs-example-client" \
  -H "apollographql-client-version: 1.0.0" \
  -H "x-auth-token: YOUR_API_SESSION_TOKEN" \
  --data '{"query":"query GetUrlData($crawlId: ObjectID!) { getReportStat(input: { crawlId: $crawlId, reportTemplateCode: \"all_pages\" }) { crawlUrls(reportType: Basic, first: 3) { nodes { fetchTime pageTitle responsive url wordCount } totalCount } } }","variables":{"crawlId":"YOUR_CRAWL_ID"}}' \
  https://api.lumar.io/graphql
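
The same request can be prepared with Python's standard library. This is a sketch under stated assumptions: the endpoint and headers are copied from the cURL example, the `variables` object is the standard GraphQL-over-HTTP way to supply `$crawlId`, and the token and crawl ID arguments are placeholders you provide. The sample query names the field `getReportStat`, while the sample response is keyed `getReportStats` and wrapped in a list, so the hypothetical `extract_nodes` helper accepts either shape.

```python
import json
import urllib.request
from typing import Any, Dict, List

API_URL = "https://api.lumar.io/graphql"

QUERY = """
query GetUrlData($crawlId: ObjectID!) {
  getReportStat(input: { crawlId: $crawlId, reportTemplateCode: "all_pages" }) {
    crawlUrls(reportType: Basic, first: 3) {
      nodes { fetchTime pageTitle responsive url wordCount }
      totalCount
    }
  }
}
"""


def build_request(token: str, crawl_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    body = json.dumps({"query": QUERY, "variables": {"crawlId": crawl_id}})
    headers = {
        "Content-Type": "application/json",
        "apollographql-client-name": "docs-example-client",
        "apollographql-client-version": "1.0.0",
        "x-auth-token": token,
    }
    return urllib.request.Request(
        API_URL, data=body.encode("utf-8"), headers=headers, method="POST"
    )


def extract_nodes(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect URL nodes whether the stat arrives as one object or a list."""
    data = payload.get("data") or {}
    stat = data.get("getReportStat", data.get("getReportStats"))
    stats = stat if isinstance(stat, list) else [stat]
    nodes: List[Dict[str, Any]] = []
    for item in stats:
        if item:
            nodes.extend(item.get("crawlUrls", {}).get("nodes", []))
    return nodes


# To send the request:
#   with urllib.request.urlopen(build_request(token, crawl_id)) as resp:
#       nodes = extract_nodes(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.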
