Show HN: I made an iOS app recording RGBD videos and a web app playing them (telegie.com)
197 points by hanseul on Feb 23, 2022 | 55 comments



Hi everyone,

My name is Hanseul Jun, and I am a graduate student working on augmented reality and telepresence systems. I have been working on the code behind this app for the past four years, always wanting to bring telepresence technology outside of labs.

Hope you enjoy this! You can also share an RGBD video via a URL, like this example, using my app (https://apps.apple.com/us/app/telegie/id1593918560) on an iPhone that has a Face ID camera or a LiDAR sensor. I have been enjoying reading the comments here, and any feedback would be appreciated!


I have to say this is amazing! Really good work.

Might I suggest slightly more interesting examples than kicking a ball from your desk? What about a cat playing with a toy? Or a model train going round a track?


Thanks a lot!! I am currently working on my dissertation while developing this system, and kicking a ball around while setting up the dissertation study seemed interesting to me personally, but I agree it's not that interesting for others... Next time I'll at least find a cat!!


I didn't have cat toys nearby, but I give you my yawning cat as another example: https://telegie.com/v26/videos/QyLZsSZ21xp


Thanks a lot!! This is definitely a much more interesting video and you have a lovely cat!


Awesome! It looks like we could extract the depth from the video, but my knowledge of ffmpeg is too limited for that. Do you know a way to do it?


By default, mp4 files from iPhones do not include depth. The RGBD videos from Telegie are custom mkv files that include depth (written directly with libmatroska, not via ffmpeg), which the web app can display.


Getting real braindance vibes [0]. For added fun, try combining multiple sources with sequence alignment, e.g. dynamic time warping based on the correlation between the audio tracks of two RGBD videos (see the sketch after the links). Then look into NeRFs! [1]

[0] Cyberpunk 2077 - Braindance Gameplay - https://www.youtube.com/watch?v=KXXGS3MGCro

[1] https://phog.github.io/snerg/
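
For anyone curious, here is a minimal sketch of the dynamic time warping idea in Swift, assuming per-frame audio features (e.g. RMS energy) have already been extracted from both clips. The function and inputs are illustrative, not part of the app:

    import Foundation

    // Classic dynamic time warping between two feature sequences.
    // A lower cost means the two clips align better.
    func dtwCost(_ a: [Float], _ b: [Float]) -> Float {
        guard !a.isEmpty, !b.isEmpty else { return .infinity }
        let n = a.count, m = b.count
        // cost[i][j] = minimal alignment cost of a[0..<i] against b[0..<j]
        var cost = [[Float]](repeating: [Float](repeating: .infinity, count: m + 1),
                             count: n + 1)
        cost[0][0] = 0
        for i in 1...n {
            for j in 1...m {
                let d = abs(a[i - 1] - b[j - 1])            // local distance
                cost[i][j] = d + min(cost[i - 1][j],        // a advances
                                     cost[i][j - 1],        // b advances
                                     cost[i - 1][j - 1])    // both advance
            }
        }
        return cost[n][m]
    }

    // Hypothetical usage: featuresOfClipA / featuresOfClipB are assumed inputs.
    // let cost = dtwCost(featuresOfClipA, featuresOfClipB)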


This is so cool! I’ve been playing around for a while now with different objects and types of lighting. Now I’m getting this error when I try to post: {"timestamp":"2022-02-23T05:57:19.930+00:00","status":500,"error":"Internal Server Error","path":"/record"}


I'm really sorry about this... I don't think I can fix it at the moment, but I'll keep working on making the app more stable. And thanks a lot for trying the app!!!


This is seriously cool, great work! You can get that “Radiohead video” effect by pausing and panning around a video of your face :)

I might suggest changing the name of this post as I had no idea what RGBD was initially, maybe “I made an iOS app recording videos with depth information and a web app to play and explore them in 3D” or something (that’s probably too long for HN!)


Spending several years in a VR lab made me start thinking RGBD was a common term, but I'll definitely follow your advice and use more everyday words next time. And you were right that it's probably too long, since the above title was already just a few characters under the limit!


Wow, this reminds me of a scene in Minority Report: https://www.youtube.com/watch?v=OSDqZeI2WlA


I remember watching Minority Report again last year and being really surprised by this scene!! I was like, that looks like my research project, but this movie is 20 years old???


Maybe we can use a neural net to fill in the shadowed regions. I have only looked at 2D inpainting algorithms, though, and am not sure what the state of the art in 3D inpainting is.


Right, the shadowed region does make the scene look less realistic, since there is no such thing in the real world. While the lazy part of me says it's okay to leave it this way, 3D inpainting is definitely something I should look into.


Wait, what? My humble phone frontcam had depth all this time??


FYI: RGB-D = an RGB image with per-pixel depth information, as captured by devices like the Microsoft Kinect.


I think I'm missing a lot of context here. Specialist cameras repurposed into blurry pixelated streaming devices?


I think there are two missing pieces of context here. One is that this is not coming from a specialist camera, but from iPhone cameras. Many mobile phones (including some Android ones) already have hardware to capture depth. The other is about color information. This video has no loss of color information compared to an ordinary video; the color pixels are just mapped into 3D space (see the sketch below). The example you are watching has 720p color resolution.
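
To make "mapped into 3D space" concrete, here is a minimal sketch of unprojecting one color pixel with a depth value into camera-space 3D coordinates using pinhole intrinsics. The function and parameter names are illustrative, not Telegie's actual code:

    import simd

    // Unproject pixel (u, v) with depth in meters into camera space,
    // given focal lengths (fx, fy) and principal point (cx, cy).
    // The pixel's RGB color is then attached to the returned 3D point.
    func unproject(u: Float, v: Float, depth: Float,
                   fx: Float, fy: Float, cx: Float, cy: Float) -> SIMD3<Float> {
        let x = (u - cx) / fx * depth
        let y = (v - cy) / fy * depth
        return SIMD3<Float>(x, y, depth)
    }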


My socks are blown off.

If you could add a feature to measure distance when paused, it’d make renovation planning so much less stressful - I’d just walk around the place while recording a video and could measure anything post factum. No more ‘will it fit?’

(Guess who’s doing a renovation now)


I’ve used this: https://apps.apple.com/nl/app/3d-scanner-app/id1419913995?l=...

Some places like IKEA will let you get a model of the furniture that you can drop into something like Blender to see how it fits.

When scanning my wife’s childhood home we found enough space for a large hidden room in the house. We couldn’t figure out how to access it without looking too suspicious (we didn’t want anyone to know we knew).


You can already do that if you 3D-scan your place, with https://poly.cam/ for example.


That sounds like a cool idea! Maybe I can make this website support phone AR; then you could place the video on top of a ruler or some furniture for your measuring purposes. It's nice to hear about potential use cases, and good luck with your renovation!


Nice work! I’m pretty sure it will only work with the last one or two gens of the phone, but it gives you an idea of what Apple is doing, under the hood.

Apple’s ARKit is pretty cool. I’d like to do more with it, but I haven’t had an excuse, so far.


Thanks, I agree that what Apple is doing is pretty cool, both hardware- and software-wise. As for supported devices: while the better-quality rear depth camera (LiDAR) is only available on the 12/13 Pro devices, a front depth camera is included in all iPhones that support Face ID (going back about three years), which is sufficient for running this app!
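
For reference, a hedged sketch of how an app could check for depth support at runtime with ARKit; this is my own illustration, not necessarily how Telegie does it:

    import ARKit

    // Front TrueDepth (Face ID) depth is available where face tracking is supported;
    // rear LiDAR depth is exposed through the .sceneDepth frame semantic.
    let hasFrontDepth = ARFaceTrackingConfiguration.isSupported
    let hasLiDARDepth = ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth)
    print("front depth: \(hasFrontDepth), LiDAR depth: \(hasLiDARDepth)")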


In case someone is interested in recording RGB-D videos for offline processing, I wrote an app for this https://apps.apple.com/us/app/stray-scanner/id1557051662. It simply writes the raw data files to disk, which you can export to the cloud/your computer.


This is really awesome. I think the ability to record the camera position / perspective changes atop the video (while shooting and after shooting) for export to flat video or gif would be really cool. It would allow people to record creative clips and share them on existing platforms.


This definitely gives off iGoggle vibes; I would love to experience this with an AR headset.


The codebase was originally written for a HoloLens app, so you sort of correctly guessed where it is coming from! I think this app works best with AR headsets, and it also supports them through WebXR. It's just that AR headsets are really hard to find outside of labs...


This is great! Any plans for a point cloud/ 3d reconstruction app?


Bridging this app's RGBD videos and 3D reconstruction, I think in the future there could be software that turns RGBD videos into a reconstructed point cloud or mesh, so that you wouldn't need to run a dedicated 3D reconstruction app at recording time. Maybe I'll be able to work on it in, like... 3 years...?


I don't understand.


Have you tried the virtual joystick (bottom left)?

You can _move_ within the video


You should add a component that tracks your head in the browser and then re-renders the perspective. Should be fairly easy and would make it a lot more immersive.


While that is a great idea which I will definitely consider for the future, the website at least supports WebXR. If you have a VR headset, please consider trying this video with it!


I did run it on my Quest, which really takes the experience to another level.

Did you record those videos holding the camera in your hand? Because they have a shake to them that does not correspond to my head movement.

So if anyone wants to try it in a headset, be sure that you are not nauseated by movements that don't correspond to your own head.


Thanks for trying it with the Quest and for the feedback. I hadn't really thought about this issue. Yes, you are right that I held the camera in my hand. I'll think about this and will probably need to do something equivalent to video stabilization.


The easiest thing would probably be to have the first video people see be shot on a tripod; that will show everyone what it can be. I have to say that I am actually thinking about using something like that for our VR fitness game. In that case the scenes would be shot from a fixed angle anyway.

I'd just have to figure out how to get that into Godot Engine without killing the performance.


I have no idea what I'm looking at and must have missed a memo

oh, the D is Depth? Is the depth measured by a sensor or determined semantically via ML or something?


Yes, D is depth! Sorry that it wasn't clear. The depth here comes from sensors.


So it's using the power of Face ID / TrueDepth?

It's amazing! 5 stars...


Thank you, thank you!!


Solid!! Have you considered working with deep learning type stuff?


Thank you!! I have no experience in deep learning yet, but I can see it being useful for this system and would like to learn about it!


Is this different from Snapchat/SparkAR 3D effects?


It would be nice, though, if this asked for confirmation before starting to download 11 MB. While that's not a lot, some data plans are capped.


FYI, almost half of the current top 10 are ~10 MB or more, and the rest are ~3 MB.

- This at 11MB

- Simula One homepage weighing in at 72MB (wtf)

- Aesthetics job posting is 9MB (to display a redirection link...)

- Typefully (tweet 2 blog) is 10MB

I don't think it's possible to browse the web these days and manage a reasonable data cap when a handful of sites can consume so much :(


Nice work. Can you point me to resources on learning about capturing depth from iPhone cameras?


As I had prior experience with RGBD cameras, for iPhones I learned how to use Apple's libraries rather than the fundamentals. For example, depth information comes from ARKit, more specifically from ARFrame and its capturedDepthData property: https://developer.apple.com/documentation/arkit/arframe/2928....
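
For context, a minimal sketch of reading that per-frame depth during an ARKit face-tracking session; it is illustrative only, not the app's actual code:

    import ARKit

    // Receive ARFrames and read the TrueDepth camera's per-frame depth map.
    class DepthReader: NSObject, ARSessionDelegate {
        let session = ARSession()

        func start() {
            session.delegate = self
            session.run(ARFaceTrackingConfiguration())
        }

        func session(_ session: ARSession, didUpdate frame: ARFrame) {
            // capturedDepthData is nil on frames without fresh depth.
            guard let depthMap = frame.capturedDepthData?.depthDataMap else { return }
            let width = CVPixelBufferGetWidth(depthMap)
            let height = CVPixelBufferGetHeight(depthMap)
            print("depth frame: \(width) x \(height)")
        }
    }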


Amazing!!!


[flagged]


While I cannot agree with deleting this site, which would be quite a sad thing for me, I agree that the website can be improved a lot in this regard. I am planning to make the website download more gradually instead of downloading the whole video first, which will let you leave the site without downloading that much!


Google's homepage is 2.3 MB right now. This page loads 16.2 MB, so it's about as heavy as loading Google 7 times.

For comparison, YouTube's homepage is 14.2 MB for me.

So really it is not that much.


That's a good rule for products, but this is HN. As a reader, I find half-baked projects super welcome.


That seems a bit drastic!!



