Cursor and Offset Pagination Techniques with Hasura GraphQL

In this post, we will look at different pagination techniques, the types of pagination components typically used in an app, and how you can use Hasura GraphQL queries effectively for each use case.

Offset Pagination

Offset-based pagination is a popular technique where the client makes a request with parameters for limit (the number of results to fetch) and offset (the number of records to skip).

Let's take a simple example. Assume that you want to fetch 10 products from the list of all products, starting with the 21st one. This can be translated to the following query:

query {
  products(limit: 10, offset: 20, order_by: {name: asc}) {
    id
    name
  }
}

You will typically make the above query when the client navigates to the 3rd page of results. This type of pagination works best when the data is mostly static and reasonably small, and it gives users a good experience for jumping to specific pages.
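In practice, you would parameterize the page with GraphQL variables rather than hard-coding the numbers. A minimal sketch (the operation name ProductsPage and the variable names are illustrative):

query ProductsPage($limit: Int!, $offset: Int!) {
  products(limit: $limit, offset: $offset, order_by: {name: asc}) {
    id
    name
  }
}

# variables for the 3rd page, with 10 results per page:
# {"limit": 10, "offset": 20}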

However, it has a couple of issues that need to be addressed.

Performance

When the dataset is really large (millions of records) and you are trying to fetch results near the end of it, this technique doesn't scale nicely. Consider the following query:

query {
  products(limit: 10, offset: 10000000, order_by: {name: asc}) {
    id
    name
    category
  }
}

The query is trying to fetch 10 products with an offset of 10 million, sorted by name. You might assume that since it returns only 10 products, the query must be performant. But it isn't.

For the above query, the database still has to read every row up to the offset (10 million) plus the limit (10), and then discard all the rows before the offset, making the query far less performant than originally assumed.

However, do note that this becomes a bottleneck only if you are frequently querying with huge offsets against a dataset with millions of records.

Missing / Duplicates

If the data is frequently updated, there is a possibility of duplicate or missing items. Consider the following query:

query {
  products(limit: 10, offset: 0, order_by: {name: asc}) {
    id
    name
  }
}

This gives an initial list of 10 products sorted by name. Now assume somebody inserts a few more products whose names start with "A". The client then wants to fetch the next set of items and changes the offset to 10.

query {
  products(limit: 10, offset: 10, order_by: {name: asc}) {
    id
    name
  }
}

But the above query can return duplicates: the newly inserted products shift the original rows down, so products that appeared on the first page can reappear on the second. The reverse happens when rows are deleted between requests: items get skipped entirely. Either way, this leads to poor UX on the frontend with missing or duplicate items.

Cursor / Keyset Pagination

Cursors work by returning a pointer to a specific record in the dataset. This relies on one assumption: the cursor column must be unique and sequential. Data is then always read after a specific row rather than from a position computed over the whole dataset, which ensures that duplicate records are not fetched again.

Common examples of cursor columns are id (an auto-incrementing integer/big integer) and a created_at timestamp, provided the values are unique and sequential. Consider the following query with a where clause:

query {
  products(limit: 10, where: {id: {_gt: 10}}) {
    id
    name
  }
}

Here we are fetching products where the value of id is greater than 10. This always skips the previously fetched results, which would have been ids 1 to 10, ensuring no duplicates. id acts as the cursor column here: unique and sequential. Similarly, we can use a timestamp column as the cursor, provided its values are unique, since timestamps are already sequential.
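For example, here is a sketch of the same pattern with a created_at timestamp as the cursor, assuming the products table has such a column (the timestamp value is illustrative; it would come from the last row of the previous page):

query {
  products(
    limit: 10,
    where: {created_at: {_gt: "2019-11-25T10:00:00Z"}},
    order_by: {created_at: asc}
  ) {
    id
    name
  }
}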

Cursor-based pagination is more performant than offsets because the database can use an index on the cursor column to seek directly to the first matching row, instead of scanning and discarding everything before the offset.

Relay spec for Pagination

Relay is a framework for building data-driven React apps. Relay's support for pagination relies on certain assumptions about the GraphQL server, and it provides a standard mechanism to slice data and paginate the results. Hasura now supports the Relay spec and you can get started here. After you have Hasura running, you should be able to make a query similar to the one below.

query {
  products(first: 10, after: "cursor") {
    edges {
      cursor
      node {
        id
        name
      }
    }
    pageInfo {
      hasNextPage
    }
  }
}

Here, slicing is done using the first argument; this is similar to the limit applied in the earlier queries. Pagination is done using the after argument, which passes a cursor and expects results after that cursor. The query also asks for hasNextPage, which tells the client whether there are more results, since there is no concept of a total number of pages in cursor pagination.
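To fetch the next page, you would typically also request endCursor from pageInfo (part of the Relay connection spec) and pass it back as the after argument on the next request. A minimal sketch, with an illustrative operation and variable name:

query NextProducts($cursor: String!) {
  products(first: 10, after: $cursor) {
    edges {
      node {
        id
        name
      }
    }
    pageInfo {
      endCursor
      hasNextPage
    }
  }
}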

Pagination Types

We have seen the two techniques (offset and cursor). Now let's see the different pagination types.

Numbered Pagination

Consider the kind of numbered pagination component popularized by Google search, where the user can jump directly to any page of the results. To build this kind of numbered pagination, you have to use an offset-based technique.

Now, adding to the above, let's say there are also direct links to the First and Last pages of the results. How would you represent this in a query?

« First    ‹ Previous    Current    Next ›    Last »

With cursors, you never know what the last page looks like. This kind of pagination therefore falls to offsets: you fetch the total count of results and switch offsets based on what the client clicked. For example, say you have 100 products and each page displays 10 of them.

In your client, you make a single query that fetches both the total count and the nodes for the current page:

query {
  products_aggregate {
    aggregate {
      count
    }
  }
  products(limit: 10, offset: 10, order_by: {name: asc}) {
    id
    name
  }
}

Now assuming the user clicked on Last », the above query would transform into:

query {
  products_aggregate {
    aggregate {
      count
    }
  }
  products(limit: 10, offset: 90, order_by: {name: asc}) {
    id
    name
  }
}

We calculate the offset to be 90, since the previous query gave us a total count of 100 and we have a limit of 10 per page.

// pages = ceil(100 / 10) = 10 (ceil handles a last page with fewer items)
pages = Math.ceil(count / limit);

// offset = (10 * 10) - 10 = 90
offset = (limit * pages) - limit;

Infinite Scroll / Load more

The landing page of the Hasura Blog has an infinite scroll. It initially loads a few blog posts (say 20) and keeps loading more as the user scrolls, until all the data is loaded. This pattern is generally better suited to cursor-based pagination. Since the posts are shown newest first (id descending), the cursor condition uses _lt. For example, if the last post currently on screen has id 41, the next batch of older posts is fetched with:

query {
  blogs(limit: 20, where: {id: {_lt: 41}}, order_by: {id: desc}) {
    id
    title
    description
    cover_image
  }
}

As the user keeps scrolling, we move the cursor in the where clause to the id of the last post fetched. If this batch ends at id 21, the where clause for the next request becomes where: {id: {_lt: 21}}.

If your id column is a uuid, it doesn't qualify as a cursor, since it is not sequential. In that case you would fall back to offset-based pagination, or use some other unique and sequential column, such as a timestamp, as the cursor.
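One caveat: timestamps are sequential but not always unique, since two rows can share the same created_at. A common workaround, sketched below under the assumption that the blogs table has a created_at column (the timestamp and uuid values are illustrative and would come from the last row of the previous page), is to order by the timestamp with the uuid as a tie-breaker:

query {
  blogs(
    limit: 20,
    order_by: [{created_at: desc}, {id: desc}],
    where: {
      _or: [
        {created_at: {_lt: "2019-11-25T10:00:00Z"}},
        {
          created_at: {_eq: "2019-11-25T10:00:00Z"},
          id: {_lt: "b3b4a2c0-0f3c-4e6a-9b1d-5a8f2e7c1d42"}
        }
      ]
    }
  ) {
    id
    title
  }
}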

Limit in nested object

With Hasura, you can also limit the number of results in a nested object. The above queries have been fetching just the products. Now let's say we also want to fetch each product's images, but only the 5 most recent ones.

This can be done the following way:

query {
  products(limit: 10, where: {id: {_gt: 10}}) {
    id
    name
    category
    images(limit: 5, order_by: {created_at: desc}) {
      id
      url
      alt
    }
  }
}

We apply the limit and order_by arguments on the nested images field to fetch the latest 5 images for each product. Here are the docs on the same.

Choosing the right approach to pagination depends on the use case: chiefly how large the dataset is and whether the data is frequently updated.

Good Reads

In addition to this post, here are some other good references I recommend going over.
