mvoodarla's submissions

1.		Veo 2: Our video generation model (deepmind.google)
		353 points by mvoodarla 11 hours ago \| past \| 172 comments
2.		Guide to pure-audio and audiovisual speaker recognition techniques (sievedata.com)
		1 point by mvoodarla 41 days ago \| past
3.		SieveSync: Realistic, zero-shot lipsync pipeline using MuseTalk and LivePortrait (github.com/sieve-community)
		1 point by mvoodarla 82 days ago \| past
4.		SieveSync: High-quality, zero-shot lipsync built with MuseTalk and LivePortrait (sievedata.com)
		4 points by mvoodarla 3 months ago \| past
5.		Running Meta's SAM2 2x faster (sievedata.com)
		3 points by mvoodarla 3 months ago \| past
6.		API to automate social video editing with AI (twitter.com/sievedata)
		1 point by mvoodarla 7 months ago \| past
7.		Finding highlights in long-form video automatically with custom search terms (sievedata.com)
		1 point by mvoodarla 8 months ago \| past
8.		Describe Beta: The most descriptive audiovisual summaries for videos (github.com/sieve-community)
		2 points by mvoodarla 9 months ago \| past \| 1 comment
9.		AI-generated sound effects for stock videos using CogVLM and AudioLDM (sievedata.com)
		1 point by mvoodarla 9 months ago \| past
10.		AI active speaker detection on video with a 90% speedup (sievedata.com)
		3 points by mvoodarla 9 months ago \| past
11.		Masked Audio Generation Using a Single Non-Autoregressive Transformer (huji.ac.il)
		1 point by mvoodarla 10 months ago \| past
12.		Masked Audio Generation Using a Single Non-Autoregressive Transformer (arxiv.org)
		1 point by mvoodarla 11 months ago \| past
13.		The most cost-effective audio transcription API (sievedata.com)
		3 points by mvoodarla on Dec 12, 2023 \| past
14.		Audiobox Demo: Where anyone can make a sound with an idea (metademolab.com)
		3 points by mvoodarla on Dec 11, 2023 \| past
15.		Audiobox: Generating audio from voice and natural language prompts (meta.com)
		3 points by mvoodarla on Dec 7, 2023 \| past \| 1 comment
16.		Improving on open-source for fast, high-quality AI lipsyncing (sievedata.com)
		2 points by mvoodarla on Nov 22, 2023 \| past
17.		Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning (twitter.com/aiatmeta)
		3 points by mvoodarla on Nov 16, 2023 \| past
18.		Show HN: State of the art audio enhance (open source AudioSR and DeepFilterNet) (sievedata.com)
		8 points by mvoodarla on Oct 19, 2023 \| past \| 1 comment
19.		DA-Clip: Controlling Vision-Language Models for Universal Image Restoration (twitter.com/fffiloni)
		1 point by mvoodarla on Oct 16, 2023 \| past
20.		Voicebox: The first generative AI model for speech to generalize across tasks (facebook.com)
		7 points by mvoodarla on June 16, 2023 \| past \| 2 comments
21.		DINOv2: State-of-the-art computer vision models with self-supervised learning (metademolab.com)
		176 points by mvoodarla on April 17, 2023 \| past \| 16 comments
22.		Show HN: TrackObject – Drag-and-drop computer vision object tracking (trackobject.xyz)
		23 points by mvoodarla on July 19, 2022 \| past \| 3 comments
23.		Pix2Seq: A New Language Interface for Object Detection (googleblog.com)
		1 point by mvoodarla on April 25, 2022 \| past
24.		Ask HN: Building vision systems without knowing ML?
		1 point by mvoodarla on Feb 22, 2022 \| past
25.		Launch HN: Sieve (YC W22) – Pluggable APIs for Video Search (sievedata.com)
		71 points by mvoodarla on Feb 2, 2022 \| past \| 14 comments
26.		Show HN: Processing 24 hours of video in ten minutes (sievedata.com)
		77 points by mvoodarla on Jan 11, 2022 \| past \| 37 comments
27.		Turning petabytes of raw video data into a high-quality ML dataset (medium.com/mvoodarla)
		3 points by mvoodarla on Dec 30, 2021 \| past \| 2 comments