Hacker News new | past | comments | ask | show | jobs | submit login

Not an oversight — just a different use case for the data! And I wasn't sure if BigQuery had a generic approach here, but it looks like not.



Btw, I really like your query.

I modified it to get the story for up to 7 levels of recursion:

  SELECT p0.id, s.id, s.title, level
  FROM (
    SELECT p0.id, p0.parent, p2.id, p3.id, p4.id, COALESCE(p7.parent, p6.parent, p5.parent, p4.parent, p3.parent, p2.parent, p1.parent, p0.parent) story_id,
           GREATEST(IF(p7.parent IS null, -1, 7), IF(p6.parent IS null, -1, 6), IF(p5.parent IS null, -1, 5), IF(p4.parent IS null, -1, 4), IF(p3.parent IS null, -1, 3),
                    IF(p2.parent IS null, -1, 2), IF(p1.parent IS null, -1, 1), 0) level
    FROM    [fh-bigquery:hackernews.comments] p0
    LEFT JOIN EACH [fh-bigquery:hackernews.comments] p1 ON p1.id=p0.parent
    LEFT JOIN EACH [fh-bigquery:hackernews.comments] p2 ON p2.id=p1.parent
    LEFT JOIN EACH [fh-bigquery:hackernews.comments] p3 ON p3.id=p2.parent
    LEFT JOIN EACH [fh-bigquery:hackernews.comments] p4 ON p4.id=p3.parent
    LEFT JOIN EACH [fh-bigquery:hackernews.comments] p5 ON p5.id=p4.parent
    LEFT JOIN EACH [fh-bigquery:hackernews.comments] p6 ON p6.id=p5.parent
    LEFT JOIN EACH [fh-bigquery:hackernews.comments] p7 ON p7.id=p6.parent
    HAVING level=0
    LIMIT 100
  ) a
  LEFT JOIN EACH [fh-bigquery:hackernews.stories] s
  ON s.id=a.story_id

(having so many left joins consumes a lot of resources, so to run it massively I would look for a different strategy)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: