
Dynamic pages like this are usually backed by some internal API. When I search for first names starting with "a", the page sends a POST request to https://efdsearch.senate.gov/search/report/data/ with the following form data, and the server returns a nice JSON response.

Obviously you'd have to send the correct headers and CSRF token, but it'll be easier than Selenium for sure.

  draw: 1
  columns[0][data]: 0
  columns[0][name]: 
  columns[0][searchable]: true
  columns[0][orderable]: true
  columns[0][search][value]: 
  columns[0][search][regex]: false
  columns[1][data]: 1
  columns[1][name]: 
  columns[1][searchable]: true
  columns[1][orderable]: true
  columns[1][search][value]: 
  columns[1][search][regex]: false
  columns[2][data]: 2
  columns[2][name]: 
  columns[2][searchable]: true
  columns[2][orderable]: true
  columns[2][search][value]: 
  columns[2][search][regex]: false
  columns[3][data]: 3
  columns[3][name]: 
  columns[3][searchable]: true
  columns[3][orderable]: true
  columns[3][search][value]: 
  columns[3][search][regex]: false
  columns[4][data]: 4
  columns[4][name]: 
  columns[4][searchable]: true
  columns[4][orderable]: true
  columns[4][search][value]: 
  columns[4][search][regex]: false
  order[0][column]: 1
  order[0][dir]: asc
  order[1][column]: 0
  order[1][dir]: asc
  start: 0
  length: 25
  search[value]: 
  search[regex]: false
  report_types: []
  filer_types: []
  submitted_start_date: 01/01/2012 00:00:00
  submitted_end_date: 
  candidate_state: 
  senator_state: 
  office_id: 
  first_name: a
  last_name:
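
For what it's worth, here's a rough sketch of rebuilding that form data in Python using only the standard library. The field names and values are copied from the dump above. Everything about the CSRF handling is an assumption on my part: the `X-CSRFToken` header name, the idea that the token travels in a cookie, and the helper names `build_form_data` and `build_request` are hypothetical, so check the real request in your browser's network tab before relying on any of it.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://efdsearch.senate.gov/search/report/data/"


def build_form_data(first_name="", last_name="", start=0, length=25, n_columns=5):
    """Rebuild the form fields from the dump above as (name, value) pairs."""
    fields = [("draw", "1")]
    # The five columns[i][...] groups are identical except for the index.
    for i in range(n_columns):
        fields += [
            (f"columns[{i}][data]", str(i)),
            (f"columns[{i}][name]", ""),
            (f"columns[{i}][searchable]", "true"),
            (f"columns[{i}][orderable]", "true"),
            (f"columns[{i}][search][value]", ""),
            (f"columns[{i}][search][regex]", "false"),
        ]
    fields += [
        ("order[0][column]", "1"), ("order[0][dir]", "asc"),
        ("order[1][column]", "0"), ("order[1][dir]", "asc"),
        ("start", str(start)), ("length", str(length)),
        ("search[value]", ""), ("search[regex]", "false"),
        ("report_types", "[]"), ("filer_types", "[]"),
        ("submitted_start_date", "01/01/2012 00:00:00"),
        ("submitted_end_date", ""),
        ("candidate_state", ""), ("senator_state", ""), ("office_id", ""),
        ("first_name", first_name), ("last_name", last_name),
    ]
    return fields


def build_request(fields, csrf_token, cookie_header):
    """Build (but do not send) the POST request.

    ASSUMPTION: the header name "X-CSRFToken" and passing the session via a
    Cookie header are guesses; copy the real values from the browser.
    """
    body = urlencode(fields).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "X-CSRFToken": csrf_token,
        "Cookie": cookie_header,
    }
    return Request(SEARCH_URL, data=body, headers=headers, method="POST")
```

To actually send it you'd pass the result to `urllib.request.urlopen` and run `json.load` on the response. I'd guess that bumping `start` by `length` pages through the results, but I haven't verified that against the site.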



That works for the search results, but the pages to display the periodic transaction reports embed the data directly into tables. Here's an example: https://efdsearch.senate.gov/search/view/ptr/8db16cde-8a14-4...

So I think we'll still need to parse the HTML for those pages. That might still be cleaner than using Selenium, though, so I'm open to any code changes.
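
As a starting point, here's a minimal sketch of pulling table rows out of fetched HTML without a browser, using only Python's standard `html.parser`. It assumes the report data sits in ordinary `<tr>`/`<td>` markup; I haven't checked the real page structure, and `TableRows` and `parse_table` are hypothetical names.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (links, spans) also lands here.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Collapse runs of whitespace so cells come out as clean strings.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return all table rows found in an HTML string as lists of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

This grabs rows from every table on the page, so for the real reports you'd probably want to filter by header row or switch to something like BeautifulSoup once the markup is known.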


Ah got it. Yeah seems like that's direct HTML.

Dropping Selenium would make the script faster and less error-prone for sure. I'll check out the repo in depth and see if there's any low-hanging fruit.



