Get URL Data
Using the correct query
There are two ways to retrieve raw URL data (or link, sitemap, and other datasource data) from Lumar:
- The getCrawl query described on this page retrieves selected metrics for URLs in the crawl. It can be filtered and sorted, but you must paginate through URLs 100 at a time. It is well suited to sampling the available data, but not to retrieving all data for a crawl.
- The Download Raw Data query downloads all data from a datasource in a single request, but the result cannot be filtered or sorted. It is the most efficient way to access all data.
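To make the cost of paginating 100 URLs at a time concrete, here is a minimal Python sketch of a cursor-following loop. It rests on an assumption not stated on this page: that the crawlUrls connection follows the common GraphQL cursor convention, accepting an `after` argument and returning `pageInfo { hasNextPage endCursor }`. The `fetch_page` callable and the `fake_fetch_page` stand-in are hypothetical names used only for illustration.

```python
from typing import Callable, Iterator, Optional

PAGE_SIZE = 100  # page size mentioned in the text above


def iter_crawl_urls(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every URL node by following cursors until the last page.

    `fetch_page(cursor)` is a hypothetical callable that runs the query with
    `crawlUrls(first: PAGE_SIZE, after: cursor)` and returns the `crawlUrls`
    object. ASSUMPTION: that object contains `nodes` and
    `pageInfo { hasNextPage endCursor }`.
    """
    cursor = None
    while True:
        connection = fetch_page(cursor)
        yield from connection["nodes"]
        page_info = connection["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        cursor = page_info["endCursor"]


def fake_fetch_page(cursor: Optional[str]) -> dict:
    """Stand-in for a real API call: two canned pages keyed by cursor."""
    pages = {
        None: {
            "nodes": [{"url": "https://example.com/a"}],
            "pageInfo": {"hasNextPage": True, "endCursor": "c1"},
        },
        "c1": {
            "nodes": [{"url": "https://example.com/b"}],
            "pageInfo": {"hasNextPage": False, "endCursor": None},
        },
    }
    return pages[cursor]


all_urls = [node["url"] for node in iter_crawl_urls(fake_fetch_page)]
```

Each page costs one round trip, so a crawl with thousands of URLs needs many requests. This is why the Download Raw Data query is the better fit when you need everything.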
Using the getCrawl query to access Crawl URL data
The sample query below returns five properties (fetchTime, pageTitle, responsive, url, wordCount) for each crawled URL. Hundreds more are available; for the comprehensive list, inspect the CrawlUrl type.
Query:

query {
  getCrawl(id: 1612640) {
    reports(
      first: 1
      filter: {
        datasourceCode: { eq: "crawl_urls" }
        reportTypeCode: { eq: "basic" }
        reportTemplateCode: { eq: "all_pages" }
        segmentId: { isNull: true }
      }
      orderBy: [{ field: reportTemplateCode, direction: ASC }]
    ) {
      nodes {
        crawlUrls(first: 3) {
          nodes {
            fetchTime
            pageTitle
            responsive
            url
            wordCount
          }
          totalCount
        }
      }
    }
  }
}
Response:

{
  "data": {
    "getCrawl": {
      "reports": {
        "nodes": [
          {
            "crawlUrls": {
              "nodes": [
                {
                  "fetchTime": 0.38,
                  "pageTitle": "FAQ - Lumar",
                  "responsive": true,
                  "url": "https://www.lumar.io/faq/",
                  "wordCount": 4055
                },
                {
                  "fetchTime": 0.03,
                  "pageTitle": "About - Lumar",
                  "responsive": true,
                  "url": "https://www.lumar.io/about",
                  "wordCount": 1074
                },
                {
                  "fetchTime": 0.04,
                  "pageTitle": "Blog - Lumar",
                  "responsive": true,
                  "url": "https://www.lumar.io/blog/",
                  "wordCount": 605
                }
              ],
              "totalCount": 7717
            }
          }
        ]
      }
    }
  }
}
cURL:

curl -X POST \
  -H "Content-Type: application/json" \
  -H "apollographql-client-name: docs-example-client" \
  -H "apollographql-client-version: 1.0.0" \
  -H "x-auth-token: YOUR_API_SESSION_TOKEN" \
  --data '{"query":"query { getCrawl(id: 1612640) { reports( first: 1 filter: { datasourceCode: { eq: \"crawl_urls\" } reportTypeCode: { eq: \"basic\" } reportTemplateCode: { eq: \"all_pages\" } segmentId: { isNull: true } } orderBy: [{ field: reportTemplateCode, direction: ASC }] ) { nodes { crawlUrls(first: 3) { nodes { fetchTime pageTitle responsive url wordCount } totalCount } } } } }"}' \
  https://api.lumar.io/graphql
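For readers who prefer a script to cURL, the same request could be sketched in Python using only the standard library. The endpoint, headers, and query are copied from the cURL command above. The function names (`build_request`, `extract_urls`, `fetch_crawl_urls`) are illustrative and not part of any Lumar client library, and the response handling assumes the shape shown in the sample response.

```python
import json
import urllib.request

API_URL = "https://api.lumar.io/graphql"

# Same query as the cURL example above.
QUERY = """
query {
  getCrawl(id: 1612640) {
    reports(
      first: 1
      filter: {
        datasourceCode: { eq: "crawl_urls" }
        reportTypeCode: { eq: "basic" }
        reportTemplateCode: { eq: "all_pages" }
        segmentId: { isNull: true }
      }
      orderBy: [{ field: reportTemplateCode, direction: ASC }]
    ) {
      nodes {
        crawlUrls(first: 3) {
          nodes { fetchTime pageTitle responsive url wordCount }
          totalCount
        }
      }
    }
  }
}
"""


def build_request(token: str) -> urllib.request.Request:
    """Build the POST request with the headers used in the cURL example."""
    body = json.dumps({"query": QUERY}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "apollographql-client-name": "docs-example-client",
        "apollographql-client-version": "1.0.0",
        "x-auth-token": token,
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def extract_urls(response: dict) -> list:
    """Flatten the crawlUrls nodes of every returned report into one list.

    Assumes the response has the shape of the sample response above.
    """
    rows = []
    for report in response["data"]["getCrawl"]["reports"]["nodes"]:
        rows.extend(report["crawlUrls"]["nodes"])
    return rows


def fetch_crawl_urls(token: str) -> list:
    """Send the query and return the list of URL records (network call)."""
    with urllib.request.urlopen(build_request(token)) as resp:
        return extract_urls(json.load(resp))
```

Calling `fetch_crawl_urls("YOUR_API_SESSION_TOKEN")` with a valid session token should return a list of dictionaries, one per URL, each holding the five requested properties. Error handling (HTTP failures, a GraphQL `errors` key in the response) is deliberately left out of this sketch.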