

Ask YC: Your ugliest hacks. - matth

Can be whatever you like.

At the moment I'm working on something that parses eBay URLs, and let's just say this isn't shaping up to be pretty. The following is a prototype which extracts search keywords from URLs:

    def getKW(url):
        kw = ''
        if getDomain(url) == 'motors.shop.ebay.com':
            #url = "http://motors.shop.ebay.com/Parts-Accessories_Car-Truck-Parts-Accessories__f350-wheels-20_W0QQ_fxdZ1QQ_osacatZPartsQ2dAccessoriesQQ_trksidZm270Q2el1313&caz.html"
            #f350-wheels-20
            kw = urlparse(url)[2].split('__')[1].split('_W0QQ')[0]
        if getDomain(url) == 'cgi.ebay.com':
            #'http://cgi.ebay.com/ebaymotors/FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims_W0QQitemZ290277158739QQihZ019QQcategoryZ43955QQssPageNameZWDVWQQrdZ1QQcmdZViewItem&caz.html'
            #FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims
            kw = urlparse(url)[2].split('/ebaymotors/')[1].split('_W0QQ')[0]
        if getDomain(url) == 'shop.ebay.com':
            #url = 'http://shop.ebay.com/items/__mini-cooper-rims?_trkparms=72%3A543%7C66%3A2%7C65%3A12%7C39%3A1&caz.html'
            #mini-cooper-rims
            kw = urlparse(url)[2].split('/items/__')[1]

        return kw if kw else False
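
The prototype calls a getDomain helper that the post does not show. Here is a minimal sketch of what such a helper might look like, assuming it simply returns the host part of the URL. It also assumes Python 3's urllib.parse, whereas the original appears to use Python 2's urlparse module. The body of getDomain is a guess for illustration, not the author's code:

```python
from urllib.parse import urlparse

def getDomain(url):
    # Hypothetical helper (not shown in the post): return the host name.
    return urlparse(url).netloc.lower()

url = ("http://motors.shop.ebay.com/Parts-Accessories_Car-Truck-Parts-"
       "Accessories__f350-wheels-20_W0QQ_fxdZ1QQ_osacatZPartsQ2dAccessories"
       "QQ_trksidZm270Q2el1313&caz.html")

domain = getDomain(url)

# Index 2 of the parse result is the path; the chained splits are the
# same ones used in the motors.shop.ebay.com branch of the prototype.
keyword = urlparse(url)[2].split('__')[1].split('_W0QQ')[0]

print(domain)   # motors.shop.ebay.com
print(keyword)  # f350-wheels-20
```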
======
andrewtj
It's said I once wrote a Perl script to obtain queue times in a call center
that took a screen capture via VNC, then carved up the snapshot into tiles for
each queue and then OCR'd each tile to get the queue time. The time was then
shoved into memcache and the process repeated. I acknowledge nothing.

------
hs
You may want to use a regex to reduce or eliminate the if checks.

I use newLISP; this is the code:

    (set 'urls '(
      "http://motors.shop.ebay.com/Parts-Accessories_Car-Truck-Parts-Accessories__f350-wheels-20_W0QQ_fxdZ1QQ_osacatZPartsQ2dAccessoriesQQ_trksidZm270Q2el1313&caz.html"
      "http://cgi.ebay.com/ebaymotors/FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims_W0QQitemZ290277158739QQihZ019QQcategoryZ43955QQssPageNameZWDVWQQrdZ1QQcmdZViewItem&caz.html"
      "http://shop.ebay.com/items/__mini-cooper-rims?_trkparms=72%3A543%7C66%3A2%7C65%3A12%7C39%3A1&caz.html"))

    (define (getKW url)
      (find {([^/|^_]*)(_W0QQ|\?)} url 1) ;find using regex
      $1) ;return the first matched string inside (bla*)

    (map println (map getKW urls))

    ;f350-wheels-20
    ;FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims
    ;mini-cooper-rims
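
For anyone following along in Python rather than newLISP, a rough port of the snippet above might look like this. It is only a sketch: the pattern is copied verbatim from the newLISP code, while the names KW_PATTERN and get_kw are made up here, and returning None for a non-matching URL is an assumption.

```python
import re

# Pattern copied verbatim from the newLISP snippet: capture the run of
# characters (excluding '/', '|', '^' and '_') that sits directly before
# either "_W0QQ" or "?".
KW_PATTERN = re.compile(r'([^/|^_]*)(_W0QQ|\?)')

def get_kw(url):
    """Return the keyword slug, or None when neither marker is present."""
    match = KW_PATTERN.search(url)
    return match.group(1) if match else None

urls = [
    "http://motors.shop.ebay.com/Parts-Accessories_Car-Truck-Parts-Accessories__f350-wheels-20_W0QQ_fxdZ1QQ_osacatZPartsQ2dAccessoriesQQ_trksidZm270Q2el1313&caz.html",
    "http://cgi.ebay.com/ebaymotors/FACTORY-15-Mercedes-E320-300E-OEM-Chrome-Wheels-Rims_W0QQitemZ290277158739QQihZ019QQcategoryZ43955QQssPageNameZWDVWQQrdZ1QQcmdZViewItem&caz.html",
    "http://shop.ebay.com/items/__mini-cooper-rims?_trkparms=72%3A543%7C66%3A2%7C65%3A12%7C39%3A1&caz.html",
]

for url in urls:
    print(get_kw(url))
```

Because the search returns the leftmost match, it stops at the first "_W0QQ" or "?" in each URL, which is what lets one pattern cover all three domains without any if checks.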

~~~
matth
The particular hangup I faced when using regex was with this URL:
http://vi.ebaydesc.com/ebaymotors/ws/eBayISAPI.dll?ViewItemDesc&item=290277158797&t=1227377775000&ds=2&seller=la-wheel-and-tire&js=e583:1&hr=http://shop.ebay.com/?_from=R40&caz.html

>>> x = re.compile('(?:(item|t|hr)=(\d+))')

>>> x.findall(url)

[('item', '290277158797'), ('t', '1227377775000')]

I can't for the life of me figure out how to get the hr value. I started
messing around with regex again, but no luck so far.

I can do the same with Python's string methods:

>>> hr_start = url.find('hr=')

>>> hr_end = url.find('&',url.find('hr='),len(url))

>>> url[hr_start:hr_end]

'hr=http://shop.ebay.com/?_from=R40'

It's just messy as hell, it'd be nice to do everything in one swoop.

~~~
matth
Ok, mission accomplished methinks:

>>> x = re.compile("[\\\?&](seller|item|hr)=([^&#]*)")

>>> x.findall(url)

[('item', '290277158797'), ('seller', 'la-wheel-and-tire'), ('hr',
'http://shop.ebay.com/?_from=R40')]
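
For context, the earlier pattern missed hr because `\d+` only matches digits, and the hr value starts with "http". An alternative that avoids a hand-written regex is to let the standard library split the query string. This is a sketch assuming Python 3's urllib.parse (the thread itself looks like Python 2), and the `wanted` dictionary is just an illustrative name:

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://vi.ebaydesc.com/ebaymotors/ws/eBayISAPI.dll?ViewItemDesc"
       "&item=290277158797&t=1227377775000&ds=2&seller=la-wheel-and-tire"
       "&js=e583:1&hr=http://shop.ebay.com/?_from=R40&caz.html")

# urlsplit cuts the query off at the first '?'; parse_qs then splits it
# on '&' and drops fields that have no value (ViewItemDesc, caz.html).
params = parse_qs(urlsplit(url).query)

# parse_qs maps each key to a list of values; take the first of each.
wanted = {key: params[key][0] for key in ("item", "seller", "hr")}
print(wanted)
```

The second "?" inside the hr value is left alone, since only the first one separates the path from the query.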

------
sofal
I desperately needed to find a way to change the java.library.path at runtime,
which is technically forbidden.

I finally stumbled upon this beautiful ugly hack (it opens up Sun's non-public
class and hacks it via reflection):
<http://forum.java.sun.com/thread.jspa?threadID=707176>

It even works on Mac OS X.

------
ErrantX
I came up with quite an ugly hack in Python for the EventScripts plugin
(<http://python.eventscripts.com>).

It was designed to allow you to thread a request for a web page in pure Python
and also time out after a while (because threading is stupidly slow via ESP on
game servers).

Code: <http://errant.pastebin.com/f3d492f2d>

That is older code but all I can find atm. It has a tendency to crash things
:P

The final code had a lot of time.sleep(0) code in it too to force the threads
to try and grab the GIL. Ugh :P

(also for the record the hacked "kill" extension to threading.Thread I picked
up from elsewhere :))
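
For readers who cannot reach the pastebin, the general idea of a threaded fetch with a timeout can be sketched roughly as below. This is not the code from the link: fetch_with_timeout and its injectable `fetch` argument are hypothetical names, it assumes Python 3, and it makes no attempt to reproduce the time.sleep(0) tricks or the "kill" extension.

```python
import threading
import urllib.request

def fetch_with_timeout(url, timeout, fetch=None):
    """Run fetch(url) in a worker thread.

    Returns the result, or None if the worker fails or has not finished
    within `timeout` seconds.
    """
    if fetch is None:
        # Default fetcher (an assumption): plain urllib with a socket timeout.
        def fetch(u):
            return urllib.request.urlopen(u, timeout=timeout).read()

    result = {}

    def worker():
        try:
            result['body'] = fetch(url)
        except Exception as exc:
            result['error'] = exc

    # Daemon thread, so an abandoned worker cannot keep the process alive.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)  # wait at most `timeout` seconds
    return result.get('body')
```

Note that this only stops waiting for the worker; the thread itself keeps running until its request finishes or fails, which is presumably the gap the hacked "kill" extension was meant to fill.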

