
Show HN: Open source JavaScript library to record and replay the web - yz-yu
https://www.rrweb.io/
======
yz-yu
Hi, hackers, the author here.

Since I've seen some people talking about the open source idea and comparing
rrweb to some commercial products, I'd like to share a blog post about the
vision of rrweb.

[http://www.myriptide.com/rrweb-introduction/](http://www.myriptide.com/rrweb-introduction/)

The post also explains how rrweb works.

~~~
no1youknowz
Thanks for open sourcing your work. I had already built something similar to
this and other commercial products, but in jQuery.

Really interested to see how this compares.

I did see the IE11 issue. Are there any thoughts on what can be implemented
for a fallback?

~~~
yz-yu
I know there are shim libraries for the MutationObserver API (which rrweb uses
to observe DOM updates), but I'm not sure about their impact on performance.

BTW, rrweb is also a project to explore the power of modern browsers, so IE
issues may not be a high priority.
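
A rough sketch of the kind of observation mentioned above, not rrweb's actual
code: it wires up the standard MutationObserver API and passes each batch of
mutation records to a callback. The helper name `observeDomUpdates` and the
injectable `ObserverCtor` parameter are assumptions made for illustration (the
latter only so that a shim could be swapped in where the native API is missing).

```javascript
// Hypothetical helper, not part of rrweb.
// ObserverCtor defaults to the native MutationObserver; a shim with the same
// interface could be passed in instead.
function observeDomUpdates(root, onRecords, ObserverCtor = globalThis.MutationObserver) {
  if (typeof ObserverCtor !== 'function') {
    throw new Error('MutationObserver is not available; a shim would be needed');
  }
  const observer = new ObserverCtor((records) => onRecords(records));
  // Watch node additions/removals, attribute changes and text changes
  // anywhere below the root node.
  observer.observe(root, {
    childList: true,
    subtree: true,
    attributes: true,
    characterData: true,
  });
  // Return a function that stops observing.
  return () => observer.disconnect();
}
```

In a browser this could be called as `observeDomUpdates(document, records =>
console.log(records.length))`. Whether a shim is fast enough on IE11 is exactly
the open question above.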

------
deforciant
Even though I do agree that the total recording of sessions is not nice, such
tools can be extremely important during early-stage testing of your web app's
UX, especially for solo founders who don't have colleagues to tell them that
they can't figure out how to use the app :)

A nice way would be to record for a week or so, then pick out the sessions
that were obviously long but where the user didn't achieve anything. Go through
them and try improving the UX so the user won't get stuck there the next time.

------
marksomnian
Looks really cool, but I find myself thinking about the privacy implications
of using this, especially by default. Even if the user gives consent, it still
implies recording every single mouse movement and keystroke on the site.

Has this been normalised? Is this the new default?

Food for thought.

~~~
hjek
RMS has been writing about the issue of non-free session recording scripts in
The JavaScript Trap[0]:

> In addition to being nonfree, many of these programs are malware because
> they snoop on the user. Even nastier, some sites use services which record
> all the user's actions while looking at the page.[1] The services supposedly
> “redact” the recordings to exclude some sensitive data that the web site
> shouldn't get. But even if that works reliably, the whole purpose of these
> services is to give the web site other personal data that it shouldn't get.

[0]: [https://www.gnu.org/philosophy/javascript-trap.html](https://www.gnu.org/philosophy/javascript-trap.html)

[1]: [https://freedom-to-tinker.com/2017/11/15/no-boundaries-exfil...](https://freedom-to-tinker.com/2017/11/15/no-boundaries-exfiltration-of-personal-data-by-session-replay-scripts/)

~~~
SquareWheel
I agree with the privacy concern, but calling analytics software "malware" is
too extreme. It isn't mining for bitcoins on your hardware, or encrypting your
documents to extort you.

Always using the most extreme terms just makes it easier to dismiss such views
outright.

~~~
hjek
I think it depends on _how_ the software is used.

A friend of mine got a suspicious tax returns email that had a link to a form
asking for credit card information. Being careful and responsible, my friend
of course asked me if the site looked legit before actually pressing 'submit'.

Of course it was a scam site, and using session recording, they could very
well have gotten my friend's credit card details without my friend ever
pressing 'submit'.

I think it's always the context that decides whether something is malware. Is
a program that erases everything on your disk malware? Perhaps, but if it's a
disk formatting tool and you asked it to do so, then it's not.

~~~
dceddia
It's a good example, and something I often wonder about when I'm filling out a
survey and give up part-way -- did they save the questions I had already
answered?

FWIW you probably wouldn't need something as powerful or blunt as session
recording to pull this off, though. You'd only need to listen for keystrokes
on the relevant input (with document.addEventListener or similar), and send
them to the server as they're typed. Same with partially-filled surveys. IIRC
Facebook got in some heat a while ago for sending the partially-typed messages
up to the server _and_ to the other chat participant.
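
A minimal sketch of that simpler approach, assuming a single field and a
caller-supplied `send` function (both `forwardPartialInput` and `send` are
hypothetical names, not from any library). It forwards the field's current
value on every `input` event, so nothing depends on the form being submitted.

```javascript
// Hypothetical helper: forwards the current value of a field on every
// "input" event, without waiting for a form submit.
function forwardPartialInput(field, send) {
  const listener = (event) => {
    send({ value: event.target.value, at: Date.now() });
  };
  field.addEventListener('input', listener);
  // Return a function that stops the forwarding.
  return () => field.removeEventListener('input', listener);
}
```

In a page, `field` would be something like `document.querySelector('input')`
and `send` would post the payload to a server. That is all the scam site in the
example above would have needed.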

------
junetic
This is so awesome that you made this open source. There's a bunch of
companies that basically use this tech and are making lots of $$. We're
actually working on a user testing service and currently using a Chrome
extension to record video:

[https://www.userlook.co](https://www.userlook.co)

May consider switching and using your library!

Thanks for sharing this :)

------
sjroot
I played with the examples and I am extremely impressed. I couldn't tell at
all that I was being recorded; I figured the examples would just be videos of
people interacting with those pages. The speed-up feature is very neat too.

Very good work. I am actually a little surprised something like this is open
source.

------
echelon
You should offer a commercial and open source version. The commercial service
could provide a few extra features at a modest price point, but support
development of the open source platform. Perhaps it could pay your bills and
be a cheaper alternative to the existing expensive commercial offerings. I
could see you taking a big bite out of their market if you keep it up.

------
unao
really nice!

A few years ago, I created something very similar when working for
validately.com, a user testing company. The solution was tailored to our needs
and was quite unique and rather sophisticated.

Below are a few main points:

\- automatic injection of the recording script by proxying the original site /
app via our domain (optionally, users could inject the script themselves)

\- using iframe to serve the recorded page in order to preserve context and
allow to display content on top of the page

\- audio recording

\- broadcasting in real-time

\- storing all assets from a recording (images / stylesheets) to make playing
back independent from original urls

Not everything was perfect and there was always something to improve. Some
sites did not work at all due to technical limitations. But the technology was
good enough that the company could grow and transition to a WebRTC-based
solution.

I am very grateful for this rare opportunity, as the project taught me an
insane amount of useful stuff. Would love to work on something similar again.

------
sbr464
Regarding security & privacy of products like these, I think it would be
interesting to not capture any data not included within the app's codebase by
default, instead of relying on any kind of redaction steps.

For example, when recording a table and interactions with it, the cells would
just be filler elements. Form field data simply wouldn't be captured, and
content unique to any record on a page would be replaced with sample data.

That way you get to see how someone interacts with a page, but not any
context/personal information.

I think relying on sensible defaults or redacting data is a lost cause and
puts the trust/responsibility in the wrong hands. Some companies may care
about redaction while others don't prioritize it.
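
A blunt sketch of that capture-nothing-by-default idea, assuming a made-up
snapshot shape (plain objects with `type`, `tagName`, `attributes`, `children`
and `text`; this is not rrweb's real format, and `maskSnapshot` is a
hypothetical name). Every text node becomes filler of the same length and form
field values are dropped outright, so there is no redaction step to get wrong.

```javascript
// Hypothetical snapshot shape, NOT rrweb's real format:
//   { type: 'text', text: string }
//   { type: 'element', tagName: string, attributes: object, children: array }
const FORM_TAGS = new Set(['input', 'textarea', 'select']);

function maskSnapshot(node) {
  if (node.type === 'text') {
    // Keep length and whitespace so the layout looks similar, drop the content.
    return { ...node, text: node.text.replace(/\S/g, '*') };
  }
  const attributes = { ...node.attributes };
  if (FORM_TAGS.has(node.tagName.toLowerCase())) {
    delete attributes.value; // form field data is never kept
  }
  return {
    ...node,
    attributes,
    children: (node.children || []).map((child) => maskSnapshot(child)),
  };
}
```

This masks static text from the app's own codebase too, which is stricter than
the proposal above. Telling the two apart would need extra information from
the app.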

------
niffy
How does this compare against something like Full Story?

~~~
junetic
you can basically build fullstory with this library!

------
pklee
Very cool. So much better than loading YouTube videos. It would be great if
there were a way to annotate parts of it. Apologies if this is feature creep :(

~~~
yz-yu
Great idea! As I said on the landing page, demonstration is an interesting use
case for rrweb, and an annotation feature would make it even better.

------
welder
This is similar to Heap's Identify [1]. Main difference is Heap focuses on
metrics while this looks to be for debugging and UX. Any plans for a heatmap
feature built-in or third party?

[1]: [https://docs.heapanalytics.com/docs/using-identify](https://docs.heapanalytics.com/docs/using-identify)

------
skilled
This looks awesome! Does anyone have recommendations for a tool that can do
this but also let me save the video? Preferably, below 100MB per 5sec
recording...

I mean, come on, how smooth is this library?

------
grezql
Will this work with MVC frameworks like AngularJS, Vue and ReactJS?

I checked the DOM of an AngularJS app, and when I enter something in the input
field, it's not appearing in the DOM at all.

------
yuchi
Very nice project. May I suggest adding "Chinese version" (in English) to the
Chinese link in the README?

------
js4ever
Interesting, seems like the core tech of smartlook / hotjar

~~~
ilrwbwrkhv
ya I'll stop my hotjar subscription next month and use this. I'll save over
500 dollars every month

~~~
terrycody
wait, call me dumb or whatever, but what is this tool used for? I can't think
of any use for it. Any examples?

------
Walkman
This is NOT Open Source in the way you think it is yet, because the repo
doesn't have a LICENSE attached to it, so the owner retains all rights and you
are not allowed to sell it or do whatever you want with it.

~~~
insomniacity
Correct - does fair use cover even running software without a license?

This wouldn't pass the first hurdle at my dayjob...

------
hero76
neat. include that blog link in the readme

------
jagracey
Great work yz-yu. Hope you've learned a lot- I've personally found the session
replay space to be incredibly rewarding.

However, as a session replay industry competitor and a former security
researcher for most industry players, I caution anyone thinking of using a
side-project like this on production applications to proceed slowly with care.

Security and Privacy are extremely hard to get right here. The tricky thing
about session replay analytics is that attackers have a huge attack vector,
and compromise means gaining a treasure trove of all user data. The nature of
replay is in a way a form of XSS. Modern security features help (like CSPs,
iframe Sandbox attribute) but browser changes can cause issues.

Some of the challenges:

\- CSPs can often be bypassed using Google API libraries, <Object/>, <SVG>

\- Blacklisting <SCRIPT/> tags can often be bypassed with an XML namespace

\- CSS based data or password exfiltration.

\- Clickjacking, "data:" urls etc.

\- Could you imagine a web request proxy server deploying Service Workers?

\- postMsg() from further nested frames

Substantial work goes into sandboxing replay environments and limiting PII.
Defense in depth is particularly important here. Enterprise level research,
auditing, monitoring and care should be taken seriously.

~~~
yz-yu
I definitely learned a lot and enjoyed the process. Thanks for your really
important suggestions.

Quote from my introduction blog post:

===

Today we already have some commercial session replay products like Logrocket,
Fullstory, etc.

If you are just looking for a ready-to-use tool and would like to pay for its
service, I would recommend you to use the above products, because they have
well-tested backend services that can store the data for you and perform some
higher order features.

===

So I don't think rrweb is a competitor of these commercial products.

Actually, I would like to see rrweb grow into a base for many commercial
products in the future, which means it handles most of the privacy and
security issues, so other developers can build many fancy projects based on
it without spending time on the hard part again and again.

~~~
dang
(Sorry your account was rate-limited! New accounts are subject to that
restriction but it's definitely not intended for cases like this. I've marked
your account legit so it won't happen again.)

------
randex
Replay is not working for opening select elements.

