Pagination with rel=“next” and rel=“prev” (2011) (googleblog.com)
212 points by gorm on April 17, 2016 | 90 comments



Literally one of the greatest things about the Opera browser was that you could browse an entire forum or whatever (longform article etc) with the Space key, because the browser would automatically go to the `next` page on hitting the bottom of the page. They also did some cool stuff with swiping to the next page on mobile.

Opera seemed like the only browser vendor that truly championed next/prev.

I only use next/prev now as a habit because of Opera - and, since switching to Chrome after Opera's nadir, for accessibility, regardless of whether screen readers respect it. But I probably wouldn't know about it if not for Opera.

It's both wistful and aspirational to use them, because this is the way browsers were supposed to work - as was generally the case with most things Opera, of course.


Which is why I love the Space Next addon for Firefox (https://addons.mozilla.org/en-US/firefox/addon/space-next/)


This addon doesn't work properly for me (at least not on the first 2 sites that I tried it on).


The feature is available in some browser extensions as well. Vimium, for instance, offers it via ]]


Thank you. I use Vimium but never knew about that. I forgot how much I missed that feature on Opera.


From my brief impression, this one also works admirably on Chrome: https://github.com/kenaniah/chrome-navigation-plugin.


The demo video for Vimium on the Chrome store has a fairly funny bit where he goes to HN. His reaction, "ahh, sweet," truly embodies the addiction that is HN.


I saw recently that Safari's reader view actually pieces together paginated articles into one long view. I haven't had a chance to try it myself though.


Safari's Reader view is one of the primary reasons why Safari's my default browser. I still use Chrome for development, but 99% of my browsing & reading I do in Safari. The low power usage and the speed of Reader view are just unbeatable. It works so well, and it concatenates most paginated articles as well. If you couple it with the Page One extension (it has a bunch of rules that automatically redirect you to the single-page view of articles), 99% of sites will be a single page in Reader view.


The Clearly extension for Firefox also pulls the text from the 'next' page(s) into the readable view. The built-in reader in Firefox cannot do that. It is a very useful feature to have.

edit: While looking for the addon on addons.mozilla.org, I found that it has been discontinued by Evernote: 'This add-on has been removed by its author.' https://addons.mozilla.org/en-GB/firefox/addon/clearly/

Anyone know of another addon that works similarly?


Safari's Reader view is actually a modified version of Readability. At one point, I believe Apple even mentioned this in an about screen or article.

There are extensions/bookmarklets available for other browsers at https://www.readability.com.

Personally, I've been reading long-form and paginated content using the Readability Send to Kindle bookmarklet, which works great.


Readability's slow; it sends requests to their server. I've tried every extension and addon for Chrome and Safari... and nothing beats Safari's Reader view when it comes to speed and quality of text extraction. When you hit Command+Shift+R, the page renders in a fraction of a second.


It's a shame that Opera didn't choose to open source their code long ago. I can't help but think that would've changed their course drastically, and for the better. Having another Chrome clone seems like a sad way to go.


Yeah, this article reminded me about that feature. Used to use it all the time.

There are fewer webpages where this would be useful for me now. Most either have infinite scroll, or their forums/comments are in a complex tree structure instead of pages.


Many forums haven't gone anywhere. By forums I mean what some refer to as bulletin boards.

Neither have blogs. Granted, Medium is making a complete mess of the semantic web, but that's all the more reason for them to support a semantic standard that might guide hapless readers such as myself to find their way.

For the most part, the "change" that "obviates" the standard is developers writing shitty code with all kinds of JavaScript nonsense that wreaks havoc on accessibility, load times, and UX. It's a little like saying right-clicking and using the back button in the browser aren't needed anymore, because dumb new frameworks are breaking them. :)

Dear lord, I just remembered Twitter's hashbang urls. To this day, they still break my Pinboard bookmarks.


Good point about making the interface and content more accessible.


At least in Microsoft Edge, the "Next" button on the toolbar sends you to the "next" page.


I actually posted this link because I found rel=next pagination underused while implementing fast forward (with Space support) in the Vivaldi browser.

If it's not present, we start analyzing links, like Opera did, which is a little messy.


I wonder why they recommend putting <link rel="..."> tags into the head, instead of marking up existing <a href> tags (which you'll have in most cases for prev/next) with rel attributes. Is there an actual difference in parsing? Or is it just because <a> tags might not be on every site?
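
For reference, the two placements look roughly like this (a minimal sketch; the URLs are made up):

  <!-- in the <head>: a property of the page as a whole -->
  <link rel="prev" href="http://www.example.com/article?page=1">
  <link rel="next" href="http://www.example.com/article?page=3">

  <!-- on an anchor: the same hint attached to a visible link -->
  <a rel="next" href="http://www.example.com/article?page=3">Next page</a>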


Well, for one thing, it's in the HTML standard [1].

But also consider websites such as forums that allow users to post custom HTML to a page. Although there is usually some sanitisation, it often doesn't stretch to removing attributes from <a> tags. In those cases, it might be possible for a user to hijack the sequential links and make them point to some other website.

[1] https://www.w3.org/TR/html/links.html#sequential-link-types


your link says:

The next keyword may be used with link, a, and area elements

Of course it is allowed to put it in <link>. But if your badly sanitized HTML (which hopefully did strip the malicious onClick handler ;)) includes an <a rel="" href="">, are we sure that parsers will prefer the <link> in the header? Or will Google and others actively ignore the <a>, despite the standard allowing it on both? That would be useful to know.

For simple bots, putting an easy-to-discover link in <head> might be good, but a search engine and a browser both have to parse the entire page anyway...


Because semantics. The location of the "next" page is a property of the entire page, and therefore belongs in the head (similarly, so does the title of the page). It's not a property that may be present in any of the links (or other elements) of the page.

There are a whole lot of advantages that follow, though they aren't quite the reasons why (e.g. as with a physical book, I don't need to read the page itself to figure out where the next page is); it all follows from applying the semantics of webpage (meta)data in a consistent manner.

(except for a CYOA book)


I believe it's to unambiguously tie the "next" to the page. Where the page content might have, for example, an <a> that moves images around in a carousel.


There is only one next in the sequence, but there may be multiple navigational elements that bring the user there. For example, links in both a header and a footer.


If you had to put them on your 'a' tags, it could conflict with other "rel"s. That is, how would one make a link that was both "next" and "nofollow"? (that's probably a bad example, but you get what I'm getting at)



One of their examples is a single sentence split over three pages of an article. I'm not sure if that's just a toy example, or also a jab at the way clickbaiters game the view counts.


I would guess that whoever wrote it subtly wants to say that if web servers/apps send partial responses to requests without a Range header, then something has broken - probably somewhere between our keyboards and chairs.


Ah, this might be the cause of the infuriating behavior where you search for a term that appears on one page of a huge forum thread, but after clicking the search result you end up somewhere else in the thread entirely.

I don't know whether Google sends you there, or whether the site redirects you from the "view all" page to the first page, but the result is pretty annoying.


Besides making the big G happy, these semantic relationships also help library authors easily traverse a series purely on the front end. Years ago, I looked at GitHub's pagination API [1] and discovered this approach also makes infinite paging much easier: just keep following next pointers until there aren't any more. So I decided to support this format by default in backbone.paginator [2].

[1]: https://developer.github.com/guides/traversing-with-paginati...

[2]: https://github.com/backbone-paginator/backbone.paginator


How do you guys implement this / what's the best practice for paginating your DB using SQL and NoSQL?

I know there are a bunch of gotchas to look out for.


I feel there are essentially two approaches. You could store the entire article text as a single chunk of data, with the text marked up to indicate where the page breaks are. Each time a page of the article is requested, the entire article is fetched from the database, but the website software extracts only the necessary portion. This is rather inefficient in principle, although it is possible to optimise, and it does keep things relatively simple.

On the other hand, you could have an "articles" table and a "pages" table with the relationship:

  articles 1 <--> 1..* pages
Each page's text is stored as a separate row in the "pages" table, along with the article ID and a page number. When a page is requested, only the row matching the requested article's ID and the appropriate page number is fetched. The database schema here is more complex, but is likely to be more efficient.
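
A minimal sketch of the second approach in SQL (table and column names are illustrative):

  CREATE TABLE articles (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
  );

  CREATE TABLE pages (
    article_id  INTEGER NOT NULL REFERENCES articles(id),
    page_number INTEGER NOT NULL,
    body        TEXT NOT NULL,
    PRIMARY KEY (article_id, page_number)
  );

  -- fetch page 3 of article 42 without touching the rest of the text
  SELECT body FROM pages WHERE article_id = 42 AND page_number = 3;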


I'm no DB programmer, but despite the inefficiencies of the first approach (fetching the entire article each time and displaying a different chunk), it would allow users (or admins) to change the page length after the article is written a LOT more easily than if each page was stored separately. Is this a potential concern with the second method, where each chunk is in its own row?


Assuming there is an actual user interface for composing and publishing articles, that interface doesn't have to be dictated by the way the data is stored.

Whichever method you choose, the software will need to be able to transform the data between two formats: one suitable for the user interface, and one suitable for the database.

However you designed the UI, it would be possible to use either approach for the database. For example, suppose the UI used a single textarea and had a special notation for delimiting pages. For the first approach, the contents of the textarea can simply be stored in its own field. For the second approach, the software would take the contents of the textarea, split it into its individual pages, then store each page separately.

If the UI had separate textareas for each page, then for the first approach, the software combines the contents of the textareas and puts in delimiters. For the second approach, each textarea is mapped to its own database row.

I suppose some combinations are easier for the programmer than others, but so long as the UI is designed to be easy to use, the user need never know how the article is actually represented in the database.


http://use-the-index-luke.com/blog/2013-07/pagination-done-t...

Essentially, once your rows are ordered, your next page is "rows > the last row on the previous page, limit <number of items to return>"

This has the fantastic benefit of being index-friendly, which OFFSET is not, and if you structure the URLs for the pages well, it's obvious what you're getting back. (e.g., if your records are keyed by date, you might get /daily-reports?since=2015-01-04&limit=50 for the page that starts at 2015-01-04)

It's from an SQL site, but the concept is not specific to SQL. I'm working on an implementation of it for data stored in Mongo, at the moment.
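
For concreteness, the query behind a URL like that might look as follows (a sketch; daily_reports and report_date are made-up names, and report_date is assumed to be indexed):

  -- /daily-reports?since=2015-01-04&limit=50
  SELECT *
  FROM daily_reports
  WHERE report_date >= '2015-01-04'  -- seek straight to the page's first row
  ORDER BY report_date
  LIMIT 50;
  -- for the next page, filter on the last report_date seen
  -- (plus a unique tiebreaker column if dates can repeat)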


It's really easy to paginate data with SQL - just use LIMIT and OFFSET.

  limit = page_size
  offset = (current_page - 1) * limit
For example, let's say you have a table with 1000 products in it, and you want to display page 3 with a page size of 50. Here's how you would write it:

  SELECT *
  FROM products
  LIMIT 50
  OFFSET 100


The downside is that OFFSET is slow if you have a lot of items, like on a forum.

At which point you could assign each post in a topic an indexed `position` column that always increases from 0 within that topic, which lets you jump to arbitrary pages and parameterize perPage.

    fromPos = (page - 1) * perPage
    
    SELECT *
    FROM posts
    WHERE topic_id = $1
      AND position >= $2           -- $2 = fromPos
      AND position < $2 + $3       -- $3 = perPage
    ORDER BY position
For simpler needs, your "Next" button could load items that come after the last ID on the current page but with a LIMIT.

    <ul>
      {% for item in items %}
        <li>{{ item }}</li>
      {% endfor %}
    </ul>

    <a href="/items?afterId={{ items[items.length - 1].id }}">Next</a>

    SELECT *
    FROM items
    WHERE id > $afterId
    ORDER BY id    -- without an explicit order, the "next" rows aren't guaranteed
    LIMIT $perPage
That's fast and really easy.


This sounds like a really premature optimization to me. Let the DB do its job unless you have an explain trace showing it's slow for your workload. Adding things like sequence columns adds complexity, when this is often handled by the engine.


Or they could be speaking from experience: back in the day when webforums were the place to make online communities, as they grew beyond a certain size this sort of thing became THE bottleneck, especially for large threads.

I empirically know this was an issue back then. I also agree it's something a DB engine should handle, and I do hope that databases do a better job today.

So instead of calling out "premature optimization", maybe instead come with a practical example of a DB engine that properly handles this?


OFFSET is not "handled by the engine". It heapsorts the entire dataset before you offset to a page.

The top-level comment is asking for good ways to paginate. OFFSET works on localhost and with hundreds of rows. My solution is one that works in production once OFFSET fails you. For me, it was day 2.
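
Either way, it's easy to check for your own engine and workload; on PostgreSQL (or a recent MySQL), for instance (a sketch; posts is a stand-in table):

  -- the OFFSET form reads and discards the first 100,000 rows
  EXPLAIN ANALYZE SELECT * FROM posts ORDER BY id LIMIT 50 OFFSET 100000;

  -- the keyset form seeks directly into the index
  -- (100000 stands in for the last id seen on the previous page)
  EXPLAIN ANALYZE SELECT * FROM posts WHERE id > 100000 ORDER BY id LIMIT 50;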


It was day 2 for you, with the specific engine you used.

You don't need to heapsort the entire dataset before you offset to a page, and there are engines capable of doing this.


You need this if it's for a feed that gets frequent updates. Imagine a paginated Twitter feed, and the user clicks a link with ?page=3, and suddenly there's a new tweet, causing page 3 to have a tweet the user saw on page 2. Using afterId solves that.


I just wrote about doing it this way yesterday.

http://www.mozartreina.com/


You should at least reference the fact that with MySQL, the server has to step through OFFSET result rows before it can start streaming you data. This can get surprisingly expensive, surprisingly quickly. See http://stackoverflow.com/questions/4481388/why-does-mysql-hi...


And if the data frequently changes during pagination?


That's why it's often better to use a cursor in your client-facing API (or a pseudo-cursor that maps to something you have in your database), rather than passing the limit/offset through.


As fbonetti already said, LIMIT is the canonical way to do it with SQL databases (I would just add to their comment that you should ensure a positive page_size and offset, because "LIMIT -5" is invalid in at least MySQL, and to limit the page size to some sensible values).

For NoSQL, specifically (only?) CouchDB, you start iterating from a given key (in SQL, this would effectively be "WHERE id >= $start ORDER BY id ASC LIMIT n").

This difference is why I would recommend, especially for paginated REST APIs, providing links to the next/prev pages, so API clients can follow these links without thinking about how pagination is accomplished exactly. To prevent people from guessing the inner workings and working around your links, you can encode the parameters in a "cursor" (so instead of ?pagesize=20&page=10 you could have ?cursor=base64encode("20,10") (SQL) or ?cursor=base64encode("$startKey") (NoSQL)). IIRC Facebook does it this way. Encrypting or authenticating the cursor is IMHO overkill here, given cursor values can easily be validated and rejected if fiddled with.
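
For instance, those two cursors might decode to queries like these (a sketch; items is a stand-in table):

  -- ?cursor=base64encode("20,10"), i.e. page size 20, page 10:
  SELECT * FROM items ORDER BY id LIMIT 20 OFFSET 180;

  -- ?cursor=base64encode("$startKey"), i.e. a key-based scan:
  SELECT * FROM items WHERE id >= $startKey ORDER BY id LIMIT 20;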

Using an opaque cursor also gives you the ability to change how pagination works without breaking existing API clients.

If you plan for lots of data and many pages, think twice before outputting links to all pages (on a website) or outputting the total number of elements (in an API). COUNT(*) on InnoDB is slow and I heard it's not the quickest thing in NoSQL databases either.
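
If you do need a number, an estimate is often good enough. On PostgreSQL, for example, you can read the planner's statistics instead of scanning (a sketch; the figure is only as fresh as the last ANALYZE, and InnoDB keeps a similar estimate in information_schema.tables):

  -- fast approximate row count from the catalog instead of COUNT(*)
  SELECT reltuples::bigint AS approx_count
  FROM pg_class
  WHERE relname = 'items';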


How do you keep the cursor / know where you left off with the database? For example, I don't want page 2 to repeat what I saw on page 1 because someone else just added more articles.


Use continuations.



One great approach is cursors. Basically, you keep track of the value of the sort function (e.g., relevance) of the last record on the page, and add that to your filter (e.g., WHERE relevance > $LAST_RELEVANCE) to get the next page.

This is very efficient when your table is indexed by that field; you can seek directly to the next record.
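
In SQL, that filter might look like this (a sketch; it uses row-value comparison as in PostgreSQL/MySQL, with a unique id added as a tiebreaker since relevance values can repeat):

  -- seek past the last record of the previous page
  SELECT *
  FROM results
  WHERE (relevance, id) > ($last_relevance, $last_id)
  ORDER BY relevance, id
  LIMIT 20;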

Some database APIs do this for you!

I wrote up my experience implementing this on App Engine here, http://johntantalo.com/blog/paginating-with-bookmarks-in-app...


Why would you break content into "pages"? If the requestor wants the data, give it to them

Forcing artificial page-breaks is abusive.


So HN should be a single page with all submissions ever? ;)

Yes, don't unnecessarily break up content. But most pages are going to include lists of some sort that you want to be paginated in some way.

If you look at HN's /new queue, there are submission IDs in the next links, so you can click "more" multiple times without seeing the same content twice, even if new submissions have pushed it down the global list by now. Useful if there are many updates and new entries come in at the "top" of the list.


So if someone requires the entire table, you think a single API call should just send it? Does it not occur to you that people use paging to limit response sizes and manage the performance of their APIs?


What are the recommendations for infinite scroll?


Don't use it. It's an antipattern and an awful meme.


> Don't use it. It's an antipattern and an awful meme.

I sometimes hate it (e.g. on Facebook, because it bluntly breaks the scrollbar), but at other times I miss it (e.g. in Gmail).


Even Twitter, for instance, with their hundreds of engineers, can't get it right. Log in, scroll, read, click something; maybe it navigates, maybe it opens a modal. Sooner or later I end up reloading the page and completely losing my place. I don't bother anymore.


Sometimes doing _any_ sort of action will jump to the top of the feed. It doesn't happen as often on web as it does in their app but occasionally it will.


Yet they somehow got it right in their iOS app (that top-left back button actually works as you'd expect).

Amazing how nice it is on the phone, but how broken it is (because of modal inconsistencies, the way you'll eventually have to press your browser back button) on the web version.


Open in new tab


How about something like photos.google.com? They have infinite (well, all history) scroll, but it is done perfectly. The scrollbar size doesn't change, and you are given enough feedback to scroll directly to the part you want.

Google Maps is similar. If you make it a large canvas where the missing data is paged in, then it becomes quite natural.


Why's that?


OTOH:

- usually breaks the back button (e.g. scroll to the 20th page, click something, go back. welcome to page 1!)

- usually breaks links (how do you show someone things-on-page-10?) (fixable by tweaking the URL as you go, but nothing is really consistent + predictable by non-technical users)

- usually performs hideously on low-power machines (e.g. mobile) due to memory growth (there are techniques, but few use them). can even tank powerful machines in time.

- footers. headers. etc.


> usually breaks the back button (e.g. scroll to the 20th page, click something, go back. welcome to page 1!)

This is easily overcome by pushState or what have you. As you scroll, the URL in the address bar changes to like `/page/20/` and then going back takes you to that URL where only the results from page 20 are displayed.


This works fairly well in a lot of cases, and I definitely prefer it over page 1. But if you do that, you can't scroll up to see page 19 - you didn't really go back, you were just brought to a one-time page to see that thing you looked at most recently in a massive list. What if you were comparing things on page 17 and 20?

Another approach is "virtual scrolling": preserving space for the items above and loading whatever you scroll to. I've seen that once (I forget where) and it was really nice. They didn't tackle the linkability problem though, and had other issues.

But all of this is fairly complex, and very few sites (or apps!) actually do so. Which is part of the problem.


It is extremely rare that it doesn't bring my browser to its knees after a certain amount of scrolling. (This is with 12 GB on Chrome.)

You will also end up losing your reading position after a browser restart or on iOS where Mobile Safari reloads a page to save memory.

It's a dumb, pointless feature that at the very least shouldn't be the default option. I'm sure there are hypothetical scenarios that may or may not warrant it, but it is a pain and a half for the user. Tumblr blogs have it in spades, and it's a miserable experience to browse.


I don't understand why more sites don't just use pagination with larger result sets. Return pages with 100 or 200 results each. It takes a little more bandwidth, but it's a good middle ground between pagination and infinite scroll.


Not to mention that people still use that one terrible loading-animation icon that looks like crap, because they didn't configure the CSS properties correctly.[^1]

[^1]: https://jsfiddle.net/76FSM/17/


There are probably things the web developer can do to minimise memory consumption, though there isn't an obvious solution.

That said, the problem of losing your reading position can be solved using something like history.replaceState(). As the user scrolls, update the history entry with a reference to the current reading location. Then, when the browser restarts, the website can know where the user was previously at and can return them to that location. This is the approach Discourse uses [1]. Alternatively, I suppose cookies could be used, but that's a slightly messier solution, imo.

Unfortunately, I haven't seen many websites adopt such a feature. Which is a shame because infinite scroll can make for a great user experience -- if done right. But too many websites actually suffer as a result of poorly implemented infinite scroll.

[1] https://eviltrout.com/2013/02/16/infinite-scrolling-that-wor...


> Alternatively, I suppose cookies could be used, but that's a slightly messier solution, imo.

Cookies are shared between tabs. You should use the URL or built-in browser capabilities (remembering scroll position on a page) for this purpose.

Infinite scroll has other UX problems, too. For example, you cannot see the footer of a page or navigate to items by page number. And since there are no page numbers, you cannot even remember how to find a particular item again.


Not necessarily. Though it's true that many implementations have these problems, they're not inherent to infinite scroll.

Regarding footers, who's to say the footer needs to go beneath the infinite scroll area? You could instead place the contents of the footer in a sidebar that stays in a fixed position. Alternatively, the footer could be overlaid at the bottom of the page so it is always visible (a recent design trend would have it hide when the user scrolls down, but reappear when the user scrolls up).

As for page numbers, there's nothing stopping you from adding them. Each item should probably have some sort of permalink to allow you access it directly, and there should be suitable navigation to allow you to find individual items.

I find it quite sad that infinite scroll gets a bad rap due to poorly designed implementations. Perhaps it will get a better reputation if more websites actually put some thought into how infinite scroll is used.


Pages become so complex that browsers start lagging. If you click a link on the infinitely scrolling page and then come back, your position is lost and you have to scroll down the whole way again (see Facebook, Twitter).


I can't stand trying to click a footer link but it keeps loading more content. If you use infinite scroll, don't put anything underneath it.


Infinite scroll breaks text search. :-(


I love infinite scrolling on Pinterest but hate it on an ecommerce site with more than 200 items on a page.


Infinite scroll is fine on some websites, like Reddit, image galleries, or Medium.


From the same Google Webmasters blog (2014): https://webmasters.googleblog.com/2014/02/infinite-scroll-se...


Like everyone else said, avoid infinite scrolling, because it's not suitable for all types of websites. But if you still want to use it, then add a "load more" button just before the footer. That way, if someone wants to check the footer information, they don't have to fight the annoying infinite scrolling.


Can anybody point to an implementation of infinite scroll that actually works? I know Facebook's implementation doesn't work correctly, because the scrollbar breaks when you scroll down.


Seems that we can learn lessons from the comments here, and hopefully someone can point to a decent implementation, or build one.

Salient points seem to be:

# Update URL during scrolling so links don't break, and link to the "deep" item not just the first page.

# Don't break history, so back/forward/reload buttons act as would be expected.

# Truly endless scrolling kills browsers due to consuming increasing resources, so don't do it. Seems that a compromise might be: larger page sizes that lazy-load content up to a sensible max length. For example, in a data set of 1,000 results perhaps display the first 10 immediately and lazy-load up to 50 during scrolling. Then normal paging occurs. (Max items per page could be user-selected, as is currently seen sometimes.) Items that have scrolled off the top of the page could be removed from the DOM too, and lazily reloaded on demand.

# Endless scrolling is not always appropriate, but there may be appropriate use cases. It would be good to identify these cases, perhaps through user testing.

# Obviously, it shouldn't break screen readers or text-only browsers etc. These should be handled gracefully.... which raises the question of how to effectively navigate through a large list, possibly of unknowable length, in these browsers.

I feel sure I've seen various of these, but not all together in one implementation. Anyone seen somewhere that does all this well?

Any further suggestions/criticisms of this pattern?


I've seen implementations on other systems where this has been solved by placing a vertical bar to the right of the window. The height of this bar represents the total length of the document. Inside this vertical bar is a small bar which represents the current viewing window on the document. The user can "scroll" this small bar to move the current viewport over the large document.

I think I might start overriding the existing methods users are used to with this method, because I've built websites for a couple of years and think I know better than years of UI development by people much more experienced than I.


Sounds familiar. I recall instances where the small vertical bar-within-a-bar you mention becomes vanishingly small, and the overall page size unmanageable, making "scrolling" a joke.

It would be nice to know if we're making progress on problems like this. Paging and scrolling both evolved to solve problems. Perhaps they're imperfect, or could stand improvement. Perhaps not. Regardless, an open-minded discussion would probably shed some light on current thinking.


This is a blog post from 2011. The current HN submission title does not indicate this article's age.


I agree, but Google still seems to recommend this usage.


The iPod Notes format used this type of pagination! It was essential back then because each note could only have a maximum of 1000 characters.


One of the nice things this does is give the browser's "reader mode" a hint where the next page is. And it's great. Only problem is, publishers seem to have no interest in helping the reader mode. On the contrary, many seem hell-bent on defeating its content detection algorithm, etc, to make it useless so we all look at their stupid ads.


Also known as a good way to break many a SPA.


Single-page application? Can you explain why the existence of next/prev relationships would break these? I would expect such a site to simply not use this. It's definitely nice for linearly-linked content on supporting browsers (e.g. old Opera, as mentioned elsewhere in this thread).


Well, look at it this way: when creating a SPA, you're running everything in a single place - a single HTML page. In some instances it makes sense to push the state to the query string, but often it doesn't, and more often still, developers haven't thought about it at all. So every time you hit the back button and weren't where you thought you'd be, that's the same issue. SPA frameworks are getting better at that, but it still happens to me daily.




