
Show HN: A Python Script to Download Thousands of Wallpapers at Once - geekspin
https://github.com/GeekSpin/Wallhaven-dl
======
slenk
Side question - are there any websites like wallhaven, but with fewer people
and less anime? I'm thinking of the type of content Chromecast uses, or
/r/TechnologyPorn?

~~~
xemoka
Have you seen http://interfacelift.com/ ?

------
shubb
You might want to async the downloading part. It is often faster to download 5
images at once than to download them one after another.

In the choices section you use input("text") properly in one place but not in
the others. You use a couple of different ways of decoding codes, and the dict
way is nicer, but consistency is nice too. Also, I am not sure you handle bad
input (default all?)...

Personally, I'd pull the meat out of the loops in main() into functions -
GetImageList() and GetImage(). Relatively complex there so it would be easier
to read and spot errors in those bits of code in isolation.
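
A minimal sketch of both suggestions, assuming a thread pool from the standard library. The names `get_image` and `download_all` are hypothetical and not taken from the script; the `fetch` parameter exists only so the download step can be swapped out or tested in isolation.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def get_image(url):
    """Fetch one image and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_all(urls, fetch=get_image, max_workers=5):
    """Fetch several URLs concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves the order of `urls` even though the
        # individual fetches may finish in any order.
        return list(pool.map(fetch, urls))
```

Threads are enough here because the work is I/O-bound; `max_workers=5` simply mirrors the "5 images at once" figure above.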

~~~
ASalazarMX
Unless you're in dire need of thousands of wallpapers (most of them you are
going to delete anyway) it's better not to hammer the website. I'd even limit
the download rate.
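
One way to limit the rate, sketched as a generator that spaces out whatever is iterated through it. The name `throttled` and the one-second default are assumptions for illustration; `clock` and `sleep` are parameters only so the timing can be replaced in tests.

```python
import time


def throttled(items, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than one per `min_interval` seconds."""
    last = None
    for item in items:
        if last is not None:
            # Sleep only for whatever part of the interval has not yet elapsed.
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item
```

Wrapping the list of image URLs in `throttled(...)` before looping over it would then cap the script at roughly one request per interval.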

------
jfz
Using i and j in loops makes code much less self-explanatory, especially when
you could use "page_index" and "image_index" instead.

~~~
lou1306
Even better, replacing

    for i in range(len(imgid)):

and similar lines with:

    for i, img in enumerate(imgid):

would allow one to get rid of all these list accessors.
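
A small illustration of the difference, using `imgid` as the list name quoted above. The IDs and the `label_images` helper are made up for the example.

```python
# Sample data; the real script would fill this list from the site.
imgid = ["1001", "1002", "1003"]


def label_images(ids):
    """Build one label per image without indexing back into the list."""
    labels = []
    for i, img in enumerate(ids):
        # `img` is already the element, so no `ids[i]` lookup is needed.
        labels.append(f"{i}: wallhaven-{img}.jpg")
    return labels
```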

------
myroon5
It looks like this downloads them one-by-one instead of downloading at the
same time.

Is there a simple Python equivalent to Java's .parallelStream().forEach() that
would allow these calls to easily be run in parallel?

~~~
mynewtb
Sure thing...

    from multiprocessing import Pool

    # Run `function` over `sequence` using 8 worker processes.
    with Pool(8) as p:
        p.map(function, sequence)

~~~
myroon5
Neat.

I see that you explicitly specify 8 as the number of processes here. In Java,
parallelStream() will pick a sane default for you if you haven't previously
specified (based on the number of available processors, I believe). Is
something like that possible in Python?

~~~
jstarfish
The ThreadPool class picks a sane default (number of cores), but I believe it
uses Python threads instead of processes.
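
For reference, a sketch of the no-argument form. As far as I know, both `multiprocessing.Pool` and `multiprocessing.pool.ThreadPool` size themselves from the CPU count Python reports when no number is passed. `square` and `parallel_squares` are placeholder names.

```python
from multiprocessing.pool import ThreadPool


def square(n):
    return n * n


def parallel_squares(values):
    """Map `square` over `values` with a default-sized thread pool."""
    # No size argument: the pool picks its worker count from the CPU count.
    with ThreadPool() as pool:
        return pool.map(square, values)
```

Swapping `ThreadPool` for `multiprocessing.Pool` gives processes instead of threads, with the usual caveat that the mapped function must then be picklable.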

------
kpenc
Every function has side effects. Could've written the whole thing in one
function. Also, PEP8.

~~~
apetresc
One function? Could've written the whole thing in one `wget` call.

~~~
pknerd
teach me how.

~~~
apetresc
Here's a start:

    for x in $(curl -s https://alpha.wallhaven.cc/random | pcregrep -o1 "https://alpha.wallhaven.cc/wallpaper/(\d+)" | sort | uniq) ; do wget "https://wallpapers.wallhaven.cc/wallpapers/full/wallhaven-$x.jpg" ; done
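
For comparison, the `pcregrep | sort | uniq` part of that pipeline could be sketched in Python roughly like this. The URL patterns are copied from the one-liner; `full_image_urls` is a hypothetical name, and like the one-liner it assumes every wallpaper is a `.jpg`.

```python
import re

# Same pattern as the pcregrep call: capture the numeric wallpaper ID.
WALLPAPER_LINK = re.compile(r"https://alpha\.wallhaven\.cc/wallpaper/(\d+)")


def full_image_urls(html):
    """Return full-size image URLs for the unique wallpaper IDs in `html`."""
    # set() drops duplicates (uniq); sorted() orders the ID strings (sort).
    ids = sorted(set(WALLPAPER_LINK.findall(html)))
    return [
        f"https://wallpapers.wallhaven.cc/wallpapers/full/wallhaven-{i}.jpg"
        for i in ids
    ]
```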

~~~
akx
bash, curl, pcregrep, sort, uniq and wget is not what I'd call "one wget
call".

~~~
slenk
I think hardcore bash users call that 'easy'. I have someone like that on my
team - holy crap, some of the bash stuff they can come up with.

~~~
lou1306
Hardcore bash sure sounds like fun, but when things start getting too big or
messy I usually find a Python script with some `subprocess` [0] tricks to be
way easier on the eye.

[0]: https://docs.python.org/3/library/subprocess.html
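
A minimal sketch of the kind of `subprocess` trick meant here: run an external command and get its output back as a list of lines, ready for ordinary Python processing. `run_and_capture` is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command and return its standard output as a list of lines."""
    # check=True raises CalledProcessError on a non-zero exit status;
    # text=True decodes the output to str instead of bytes.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Uses the current Python interpreter as the "external" command so the
    # example does not depend on any particular shell tool being installed.
    print(run_and_capture([sys.executable, "-c", "print('hello')"]))
```

Passing the command as a list avoids shell quoting problems, and sorting or de-duplicating the returned lines is then a one-line `sorted(set(...))` rather than another pipe stage.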

