Fastest Way to Delete Large Folders in Windows (2015) (mattpilz.com)
109 points by megahz on July 18, 2017 | 107 comments



Unless you are extraordinarily low on disk space, you should never need to "wait" to delete things, so speed shouldn't matter.

The correct method is to immediately move/rename the target, then launch an expensive delete in the background. That frees up the location you are trying to use.

I used to see people set up entire workflows that started with, essentially, "wait 20 minutes to finish recursively deleting the previous data" instead of just "move it out of the way and start immediately".
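A minimal sketch of that pattern in cmd (the paths here are made up; /low keeps the background delete from competing with foreground work):

    rem rename first so the path is free immediately
    move "D:\build\output" "D:\build\output.deleteme"
    rem then delete at idle priority, without opening a new window
    start "" /b /low cmd /c rd /s /q "D:\build\output.deleteme"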


I needed to delete 70GB and hundreds of thousands of files across 10 machines last week. It took an hour using the command-line method on each machine. Using the GUI froze Windows, and even if it had worked it would have taken at least a day.


Linux used to have a very similar problem, but it was finally fixed in kernel 4.10. I'm surprised Windows suffers from the same thing, and now it's got me wondering if macOS does this too...


Mac is horribly slow when using the GUI. Last time I had to, I realized it was faster to copy the stuff I wanted off the disk and just format it than to wait for the files to delete.


Do you have a link about the 4.10 change you mentioned?


macOS is much faster than Windows for similar operations.


However, on some systems I've seen the delete performance wedge the disk. I'm going to blame cheap SSDs for this; you could watch it delete the first few thousand files, and then requests on that drive slowed to a crawl. Including other file creates and writes. I suspect this was due to syncing each delete change, which incurs a rewrite of a whole Flash block. After a short while the drive runs out of prepared blank Flash and has to start slowly erasing some more.

And that's using rmdir.


"Wedge the disk" is exactly what this should be called.


Of course, ideally Windows should do that behind the curtains itself...


That's essentially what sending to the Trash does, AFAIK.


Yes, but then emptying the trash takes forever.

I mean when Shift+Deleting, or emptying the trash, Windows should:

- mark the folder as "garbage" in the filesystem and rename it to some impossible name, freeing up the original one

- hide the folder from the UI (apps should not see the file anymore)

- schedule a real deletion (reclaiming the space) with low priority in the background

- in case of a crash, chkdsk should detect garbage files and delete them.


I remember somebody at Microsoft explained the trash in Windows to me in a similar way. The problem is that, from observation, it doesn't seem to work that way...


Who cares how long it takes to empty the trash? It solves the problem of freeing the name and removing it from view of the user instantly. Your solution wouldn't make emptying the trash any faster.


Emptying the trash slows down the machine a lot, so other processes suffer.


Anything in Windows seems to have the potential to slow down the machine a lot. I'm not sure why that is, but process priorities don't seem to be well implemented.

Turning off Windows Defender makes Windows Update run much faster.


Sending to the trash doesn't free the space until you empty the trash.


Yes but it frees the path to be used for new/replacement files "immediately".


Not if you don't have the space.


Which is why the first comment in the chain led with that caveat.


IIRC the trash in Windows has a quota, so it should trim eventually.


Yep. The Recycle Bin is a hidden folder that exists on every drive.
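You can see it from a prompt (pick whichever drive letter you're curious about; dir /a includes hidden and system entries):

    dir /a C:\$Recycle.Bin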


Sending to the trash still takes an eternity.


Wasn't deleting, at some distant time in the past, done by just renaming a file and changing the first character to a question mark? I think that also applied to folders: delete a folder's "file", and all the contents are gone, recursively. Not sure how they found free blocks when creating a file, though. One downside is that you never knew how much space you had left. I think at some point they changed it so that all file operations updated the "free space left" label in the partition. This may have been FAT around Win98, but maybe I'm misremembering, or it may be some more obscure file system I'm thinking of...


[I'm assuming you mean DOS FAT filesystems (FAT12, FAT16, FAT32) - anything else might be different]

> Wasn't at some distant time in the past deleting done by just renaming a file and changing the first character to a question mark?

That is just the entry in the directory listing - it would still walk the file allocation table and mark the relevant blocks as unused. The FAT operated as a collection of linked lists - the directory listing mentions the first block, and each entry in the FAT states which block comes next for a file, or 0 if it is the last block. Really, a sub-directory was just a special file containing filenames and other properties, so deleting an empty directory follows the same process. The root directory is a special case, being of fixed length (12 blocks IIRC) at a fixed position in the structure.

> I think that also applied to folders: Delete a folder's "file", and all the contents are gone, recursively.

No, you couldn't delete a directory with contents by default. Ordering a recursive delete would perform a depth-first search-and-delete on each individual object.

> Not sure how they found free blocks when creating a file though.

As the FAT had been updated, it was a simple first-empty-block search.

Unless something had deleted something just by editing the directory entry, in which case the blocks would be left alone, as they would still be marked as in use. I did see this used as a way to try to hide sensitive information without it getting overwritten, both as a naive copy-prevention mechanism and as a "hide my porn" technique.


Apparently for NTFS it is a bit in the MFT flags.

https://www.codeproject.com/Articles/9293/Undelete-a-file-in...

> If this flag has bit one set, it means that the file is in-use, else it is deleted.


That's how DOS worked with FAT16, yes. It will still apply to files on USB and SD cards (FAT32), but internal Windows drives will be formatted with NTFS.

The FAT entries are cleared, so if the file is fragmented, reassembling requires knowledge of the format. Also, Wikipedia tells me that FAT32 directory entries also have some bits cleared, so the right data on disk may not be found.


That was the CP/M-80 filesystem. The free-space allocation map was created when the disc was mounted, by scanning all file entries. Free space was available via STAT (AFAIR).

DOS (FAT) used an allocation list on disc. CP/M couldn't corrupt the free-list (didn't have one). Unix is a different animal. CP/M didn't have "folders" (user areas instead).


This really sounds like the FAT file system


If you're doing this regularly, a good alternative is to install on a separate partition and symlink it. Then to delete, you just quick format the partition.

  echo Y | format Z: /FS:NTFS /X /Q
source: https://superuser.com/a/352321
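For the symlink half, a junction avoids the admin rights that directory symlinks need; a sketch with placeholder paths (recreate Z:\scratch after each quick format and the junction works again):

    mkdir Z:\scratch
    mklink /j "C:\projects\build" "Z:\scratch"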

Also, on Windows 10 you can use Ubuntu bash and run rm -rf. I've read that it's faster, but haven't tested extensively.

Update: I did a quick test with 100,000 files and the 'del /f/s/q foldername > nul' approach was about 50% faster than 'rm -rf' on my machine.


For that matter, if you use the MSYS *nix tools (they come with Git for Windows), you get a bash prompt in Windows and can do the same... I find the MSYS bash more to my liking for general terminal stuff... though I haven't tried the Ubuntu bash in a while (it irked me in many ways).


He's not handling spaces in directory names in the "cd %1" portion of his context-menu entry. Fortunately his "fastdel.bat" should prompt the user and, hopefully, they'll catch the mistake.

Calling CMD.EXE from Explorer without double-quoting can have unintended consequences when ampersands and parentheses are present in the filename too. I'd hate to "Fast Delete" a directory named "& rd /s /q %SystemRoot% &" using his shell context menu entry.

If I was hell-bent on doing this I'd probably add it to my user registry w/ the command:

   reg add "HKEY_CURRENT_USER\Software\Classes\Directory\shell\Fast Delete\command" /d "cmd /c rd /s ""%1"""
There are still probably some fun metacharacter-injection possibilities there, too, however.
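A hypothetical, slightly safer fastdel.bat along those lines (still not injection-proof; %~1 strips the quotes Explorer passes, so the path can be re-quoted cleanly):

    @echo off
    rem %~1 = first argument with any surrounding quotes removed
    set "target=%~1"
    rem trailing backslash makes "exist" test for a directory
    if not exist "%target%\" exit /b 1
    choice /m "Really delete %target%"
    if errorlevel 2 exit /b 0
    rd /s /q "%target%"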


Unfortunately, RMDIR, like other Windows commands, suffers from the long-path issue. You can turn on long path name support in Win10, but that broke our regular build tooling.

My current way to delete any folder, regardless of the depth of the tree, is to use robocopy (robocopy D:\EmptyFolder D:\FolderToDelete /MIR). It is actually pretty damn fast, and might be faster than using RMDIR /S /Q.
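Spelled out end to end (folder names are placeholders), the sequence might look like:

    mkdir D:\EmptyFolder
    rem mirroring an empty folder deletes everything inside the target
    robocopy D:\EmptyFolder D:\FolderToDelete /MIR
    rem both folders are now empty, so plain rd finishes the job
    rd D:\FolderToDelete
    rd D:\EmptyFolder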


This whole page is making me claustrophobic with visions of so many things that can go wrong for trying to optimize the folder delete... Compounded by a helpful samaritan who is offering an EXE download to set up a batch file... :(


The answer to this used to be Visual SourceSafe.


SourceSafe is never the answer.


"What source control system makes you doubt the logic of using source control altogether?"


Now you have two problems.


But after a short while the problems go away because, well, SourceSafe means you'll never have access to those files again, and SourceSafe itself will become unusable, solving both problems (someday you'll have to buy a new computer due to OS rot, but that's expected).


Visual Sorta Safe, as some people I used to know called it.


At some point/scale, quick format would be even faster.


Reading this article reminds me how tedious it is to customise Windows. All these manual steps in system dialogs, registry and so on have to be repeated every time you want to re-install Windows.

I wanted to run a command as admin on startup recently and had to create a scheduled task for this, which of course will be lost the next time I reinstall. Same for services, etc.


You must not know about PowerShell. For example, scheduled tasks: https://blogs.technet.microsoft.com/heyscriptingguy/2015/01/...


I haven't used Windows seriously for a while, but I'm pretty sure you can use filenames without extensions now.

Doesn't that mean that the "*.*" pattern won't match everything?


If you mean *.* [1], it's special-cased so it still works (and .* as a suffix in general will match no suffix).

https://blogs.msdn.microsoft.com/oldnewthing/20071217-00/?p=...

[1] Seeing as I fell into it too, it's probably the intuitive formatting codes that messed it up, not the parent poster themselves.


Technical interviews at my company are 1 question: how do you format a complex regex on hacker news?


    *monospace it*


Btw, how do you do that?

Also how do you create lists?


I don't know of any list-creating functionality. Lines that begin with four spaces get monospaced.


>I haven't used Windows seriously for a while, but I'm pretty sure you can use filenames without extensions now.

Wasn't this always possible? Even from DOS days, in fact?


Yes and no.

In DOS, the filename is always FILENAME.EXT, and the dot is not an arbitrary character in the name, but a separator. In FAT16 directory entries, you have two separate fields: 8 chars for the filename, 3 chars for the extension, both space-padded. The dot isn't actually recorded, and only appears in reconstituted filenames. Consequently, it's not possible to distinguish "FILENAME" from "FILENAME.", and they are considered equivalent.

Similarly, when you write something like *.* in DOS, the dot is a special symbol as well, and this really means "match any filename and any extension". So this will match FILENAME, because it still has an extension, which just happens to be blank. On the other hand, if you just use *, it will be treated as *. and hence only match files with a blank extension.

In Win32, filenames are just strings, and dot is just a character. But the last dot in the name is still considered as separating extension from the name for purposes where it matters, like determining the app to open the file - i.e. "foo.bar.baz" is "foo.bar" with extension "baz". And if there are no dots, then the file is still considered to have an empty extension rather than no extension, and will therefore match *.* and similar globs. On the other hand, * is no longer treated as *. and matches any filename now.
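A quick way to check the Win32 behaviour yourself in an empty scratch folder (the file name is arbitrary):

    rem create a file with no extension
    type nul > noext
    rem *.* still lists it, because a blank extension counts as an extension
    dir /b *.*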


Thanks for the detailed explanation.


Note that these solutions will bypass the trashcan, which is why they shouldn't really be used willy-nilly.

They will also trip up in the same way Explorer does, on paths that are too long.


Summary - use command line tools


Does anyone know how PowerShell compares to cmd.exe for this particular problem? In PowerShell, one would type:

  rm -Recurse -Force $path
It has the advantage that it only takes a single command, but I am not entirely certain about the performance.


My experience has been that Powershell is as fast as cmd for this task.

EDIT: As a test, I cloned my local Maven repository a few times. This resulted in 48,613 files, 16,590 folders, and 2.71 GB on disk. Here's the result of:

    Measure-Command { rm '.\.m2 - Copy' -Recurse -Force }

    TotalDays         : 0.00196887874421296
    TotalHours        : 0.0472530898611111
    TotalMinutes      : 2.83518539166667
    TotalSeconds      : 170.1111235
    TotalMilliseconds : 170111.1235
~~And it had a CPU pegged the entire time. So no, Powershell is still terrible at this. Stick with cmd.~~

EDIT2: Tried it again, with RMDIR and using the timing script found here:

https://stackoverflow.com/a/6209392

    timecmd "RMDIR /S /Q .m2c > NUL"

    command took 0:2:40.15 (160.15s total)
So it's within the same order of magnitude.

I might try it one more time after lunch with the DEL followed by RMDIR combo to see if that changes anything.


Thank you very much!!!


Had similar timing for DEL, so it looks like Powershell is as good as cmd for this use case:

    timecmd "DEL /F /Q /S .m2c\*.* > NUL"

    command took 0:2:57.60 (177.60s total)


It's either the same speed or slower. I once wrote a command-line delete tool with Win32 and threading. That was quite a bit faster than the built-in delete command.


Similar issue deleting files through OS X Finder, also solved by going to the command line.


Okay, but presumably it's doing something with that extra time. Is it valuable?


It's calculating the amount of space you've freed up and displaying it in a pretty graph along with the name of every file and folder that is being deleted.

This might be helpful if the filenames and sizes make you realize that you're trying to delete the wrong folder so that you can abort immediately. Other than that, it's just eye candy.


There is rarely any excuse for having a UI task that takes longer than a few seconds and doesn't have a progress indicator. That basically means that if you want to perform a 30-second task, you should show that progress bar even if that now means it's a 60-second task.

I'd probably try to cheat in this particular scenario either by keeping a best guess for the recursive delete complexity in the file system itself, or by simply showing a worse progress indicator such as a counter of files without a total count. The progress indication doesn't necessarily need to include time remaining, especially when the cost is this high.


> The progress indication doesn't necessarily need to include time remaining, especially when the cost is this high.

Exactly. I don't particularly care which file out of half a million it's on -- I just want to know at a glance if it's still running or has somehow frozen/locked up.


Probably also checking if you have permission to delete said file/folder.


Now if we could only delete files that are allegedly in use by another process.


Or, it could at least identify the process locking the file and offer to kill it.


There's handle.exe, a Sysinternals tool that helps with that:

http://technet.microsoft.com/en-us/sysinternals/bb896655.asp...
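Usage is roughly like this (the file name and PID are made up; handle searches open handles for a matching name, and taskkill can then kill the owner):

    handle.exe lockedfile.txt
    taskkill /f /pid 1234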


Yeah, or procexp, and it's even possible with resmon, which is built in. But for the average user it's basically impossible.


I've never hit a case where Shift+Del wasn't fast enough. Interesting.


I restore 35mm film scans (movies) and I sometimes have to delete six figures of raw frames (many TBs in total) from my write-RAID when a big intermediate render doesn't go the way I want it to.


But but but, how many individual files? FS deletion is not the same as erasing all the bytes. I can delete a 128PB file in an instant by crossing it out of the index.


> six figures of raw frames


I will suggest trying the "Long Path Tool" program.


On a related note, Cygwin is also faster than Windows Explorer, although I don't know how it compares to RMDIR.


Is there actually a reason to do both del and rmdir instead of just rmdir? Or is the post just being superstitious?


rmdir will only remove visible, non-system files, and will fail with the non-obvious "The directory is not empty" when it fails to delete one such file and subsequently can't delete the containing directory.

An alternative would be to use dir + attrib to make all files visible (I don't know that stripping out the system flag by default is a good idea) before running rmdir.
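That attrib pass might look like this (the path is a placeholder; -r also clears read-only, which is what usually blocks rd):

    rem clear read-only, system and hidden flags recursively, including on directories
    attrib -r -s -h "C:\target\*" /s /d
    rd /s /q "C:\target"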


Try using Long Path Tool. It worked for me. I hope it helps.



WizTree is faster since it uses only the NTFS MFT (eerily similar to the command line vs. Windows Explorer comparison in the article).

https://antibody-software.com/web/software/software/wiztree-...


rimraf is extremely fast. Started using it to delete large node_modules folders and got used to it.


Note that the latest npm has changed its behaviour for folder links - if you refer to a dependency by a folder name ("fred":"file:../fred", that sort of thing) rather than npm version or git link (etc.), npm now creates a junction inside node_modules that links directly to the original folder.

Deletion tools that don't know how to distinguish junctions and folders may then find the original files via the junction and delete them...

(del and rmdir don't suffer from this. I did get a strange error message from rmdir, though, and it didn't actually delete the junction. GNU-Win32's rm blithely follows the junction and deletes everything in it.)
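One way to check before deleting, reusing the "fred" example above (dir's /aL filter lists reparse points, and a plain rd removes a junction without following it):

    dir /aL /s /b node_modules
    rd node_modules\fred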


Legitimate question - why/how has this made it to the front page?


Because developers may need to delete large folders in Windows


That deleting (and moving) large folders is so slow in Windows is a major pain point for node.js developers.


I recently went to an NVMe drive... didn't even notice it being particularly faster until I worked on one of my larger node projects... it was about 3x as fast as my SATA SSDs. Never going back to HDD if I can avoid it; I have one for video transcodes, but that's it.


This is incredibly basic computer knowledge. Has Windows actually fallen so far out of the loop that there are developers who know Rust or Go, but don't know something this fundamental?


I know neither Rust nor Go, and I didn't know about these operations in Windows. But I get surprised when other stuff ends up on the front page. People have different scopes, and the multidisciplinary nature of HN makes it a valuable network.


It's incredibly basic computer knowledge that a modern-day OS is so badly designed that it requires the user to drop into the command line to delete large folders efficiently, because otherwise it spends most of its time telling you how long it will take to tell you how long it will take?

That's one hell of an indictment of Windows!


There's some truth to it though - the Windows Explorer file operations are flaky as hell.

I was just trying to copy files between an old system drive and a new one but I kept getting infinite recursion issues with the "Documents and Settings" (which is linked to Users) directory, permission issues, etc. Even as the SYSTEM user. I wound up having to learn a little robocopy for it.
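For anyone hitting the same recursion: robocopy's /XJ switch (exclude junction points) is the part that sidesteps the "Documents and Settings" loop. A sketch with placeholder paths, run elevated since /COPYALL needs backup privileges:

    robocopy C:\Users\olduser D:\backup\olduser /E /COPYALL /XJ /R:1 /W:1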


Yeah, since Windows Vista/7, when Documents and Settings was moved to Users, the system creates new user directories with a bunch of symbolic links/junctions that don't seem to be properly created and always cause issues when copying the directory to another place or deleting the directory. I always get this problem when moving my profile directory out of the system partition.

Incidentally, it's amazing that Windows still doesn't let you specify a custom location to create the profile directory of a new user.


I work with Microsoft stuff every day, but I didn't know the little registry trick at the end, so that's handy.


Windows users vastly prefer modern GUIs to old, archaic command-line interfaces that were invented in the 1960s.

So, I don't see how these commands should be considered fundamental to those users.


As someone who uses these "old, archaic command-line interfaces" all day, every day, on Linux, Windows, and macOS, I find this characterization very entertaining.

This is an age-old argument (GUI vs command line), and I think each generation coming into computing via the GUI will initially agree with you, but after some time trying to do things that are slow and frustrating using the GUI, will break out into song and dance on discovering the ease and speed with which the command line can do them.

It's all about using the right tool for the job. GUI is awesome for many things, the command line is better for many others.


GUIs are objectively better for every single task. Given a proper choice, 100% of users would choose a GUI over the command line. I have no doubt about this in my mind.

Unfortunately, GUIs are still more difficult to build well. Once that is no longer true, the command line will soon cease to exist.


> GUIs are objectively better for every single task

If they are objectively better for every single task, there will be metrics and studies that prove this objectively. Kindly cite these studies and metrics. I don't think they exist.

I refute the quoted statement with two words: headless servers. Oh, and REPLs. And CLIs embedded in so-called GUIs. In many cases, GUIs contain command-line emulators. Think of, say, Wireshark's filters, or any JavaScript console.

Maybe you are railing against text-mode displays, as opposed to terminal emulation programs that use graphics to emulate a text-mode display?

> Given a proper choice...

No true Scotsman...?


Can you even imagine a Turing-complete GUI? That's what would be necessary to rival the Turing-completeness of scripting in a CLI.

And even if such a GUI were created, I don't think it would be any easier to understand than learning CLI commands. And it would require significant overhead to use; consider, for example, how you would search files for content matching a regex - in a CLI I simply type in the regex one-liner, but in a GUI I guess I'd have to click around visually building the regex? This would be a nightmare to use.

GUIs are definitely not a good tool for a vast range of tasks.
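For concreteness, the CLI one-liner in question might be something like this in cmd (the pattern and file glob are invented):

    rem /s recurses, /r treats the string as a regex
    findstr /s /r /c:"delete.*folder" *.txt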


Why would I even want a Turing complete interface? So I can program it? No thanks! Sounds tedious. I want to get my work done, not create more work.

Regex is also not the CLI; it's a text-pattern-matching DSL that you enter into your CLI as an argument. There's nothing stopping anyone from using regex patterns in a GUI.

If I had to type regex patterns I'd much, much rather type them into a GUI instead of a CLI because when I do this, the results can be instantly actionable without any further thought. Instead of having to do more CLI-programming to act on the results, I can typically act on them immediately in a GUI.

Better than regex, though, would be a simple GUI for handling the most common cases (match case, match whole word, etc. - the options that most IDEs offer) and, yes, a visual builder for making rules for the more complex cases. I'll take a well-built visual builder any day of the week over a textual representation of a regex pattern that I have to use rote memorization or external references to understand.

Today's GUI systems are definitely not a good tool for a vast range of tasks. That's not my point though. GUIs have the potential to be way, way better than any CLI but unfortunately there are a lot of things holding them back like market forces and unimaginative people. I fully expect those things to change at some point, but probably not in my lifetime.


> Why would I even want a Turing complete interface? So I can program it? No thanks! Sounds tedious. I want to get my work done, not create more work.

Well, you wouldn't; I would. Much better to get the work done, perhaps save it as a small script, and next time simply run it. Yes, I know GUIs tend to offer macros, but there's no way I'm going to trust them to do something without being able to see exactly what they're trying to do.

> Regex is also not the CLI, it's a text-pattern matching DSL that you enter into your CLI as an argument. There's nothing stopping anyone from using regex patterns in a GUI.

So you're not arguing for a full GUI, you still expect parts of the input to be entered as some cryptic text commands (which essentially is the same as CLI).

> I'll take a well-built visual builder any day of the week over a textual representation of a regex pattern that I have to use rote memorization or external references to understand.

That's fine, but it'd still be much slower to click out a complex regex in a GUI than to just type it out in a CLI (not to mention I'd have to check the text regex generated by the GUI anyway, to make sure whatever I clicked out is actually what I want). You could say it's a good tradeoff of convenience vs. speed, but that still leaves GUIs nowhere near "objectively better for every single task".


How: People have up-voted the story.

Why: People found it interesting.


When I posted my comment, the story had 16 points. I've seen other stories with many more points which have not made the front page. Hence my question.


Oh, you're asking about the HN ranking algorithm. I wrote about it a few years ago: http://www.righto.com/2013/11/how-hacker-news-ranking-really...

The quick answer to your question is that 16 votes in a short time beats more votes over a long time. And there are also various penalties.


It's one of those topics that's interesting because it seems obvious that Windows is doing it completely wrong by not starting the deleting immediately, and doing reporting in a separate thread.

Similar to the popularity of this topic: http://blog.zorinaq.com/i-contribute-to-the-windows-kernel-w...


Install Linux?


Can't you just SHIFT+DELETE?


From the article:

> There is, in fact, a significant amount of overhead when you trigger the standard delete action in Windows, including when either emptying the Recycle Bin or directly deleting files via Shift+Del.

> Upon deleting the ~46,000 files from the NDK package, it took 38 seconds with console output enabled and 29 seconds with output disabled on a standard non-SSD hard drive, scraping off a quarter of the time. By comparison, the same deletion process via a standard Shift+Del in Windows Explorer took an agonizing 11 minutes.



