
XPath tips from the web scraping trenches - predius
http://blog.scrapinghub.com/2014/07/17/xpath-tips-from-the-web-scraping-trenches/
======
patrickg
There are some things in XPath that I really like. For example, `x = (a, b, c)`
means `(x == a or x == b or x == c)`, which is quite handy sometimes. Also,
`(a, default)[1]` makes it easy to provide a default if `a` is missing (an
empty sequence).

~~~
eliasdorneles
Thanks, that is new to me! I can't seem to get an example working, though.
Is that XPath 2.0?

~~~
patrickg
Yes, this is XPath 2.0. I don't bother with 1.0 if I can avoid it, because it
is soooo far behind 2.0.

data: <data attr="1" />

xsl:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    exclude-result-prefixes="xs" version="2.0">

      <xsl:template match="/data">
        <one>
          <xsl:value-of select="@attr = (2,3,4)" />
        </one>
        <two>
          <xsl:value-of select="(@doesntexist,2)[1]" />
        </two>
      </xsl:template>
    </xsl:stylesheet>

gives: <one>false</one><two>2</two>
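
If you want to reason about these two idioms without an XPath 2.0 processor at hand, here is a rough Python model of what they mean. It is only a sketch: the function names `general_eq` and `first_or_default` are made up for illustration, and it ignores XPath's casting rules (in the stylesheet above, the untyped `@attr` is converted to a number before the comparison), so the attribute value is passed in as an already-converted integer.

```python
def general_eq(left, right):
    """Model of the XPath 2.0 general comparison `left = right`.

    Both operands are sequences; the result is true if ANY pair of
    items, one from each side, compares equal.
    (Hypothetical helper; XPath type casting is not modelled.)
    """
    return any(l == r for l in left for r in right)


def first_or_default(seq, default):
    """Model of `(seq, default)[1]`.

    XPath sequences are flat, so an empty `seq` simply disappears and
    the default becomes the first item.
    (Hypothetical helper, for illustration only.)
    """
    return (list(seq) + [default])[0]


# <data attr="1" />: @attr yields one item, @doesntexist yields none.
attr = [1]
doesntexist = []

print(general_eq(attr, [2, 3, 4]))       # False, like <one>false</one>
print(first_or_default(doesntexist, 2))  # 2, like <two>2</two>
print(first_or_default(attr, 2))         # 1, the attribute wins when present
```

The existential "any pair" rule is also why `=` behaves this way when both sides hold several items.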

~~~
eliasdorneles
That is very cool, indeed! =) I'm still stuck with XPath 1.0, because XPath
2.0 hasn't found its way into lxml yet -- it seems to be quite a lot of work
to implement.

------
krapp
I've been working for a while on a web scraper that uses XPath, as part of
several projects - resources like this are very much appreciated.

