
Over a decade ago I helped quite a few people migrate their forums off places like Proboards, ActiveBoards, and many other "free" forum hosts to their own hosts using phpBB/SimpleMachinesForum etc.; many such hosts had highly customized forum software and no ability to download the database in any usable format. Copies of my converters might still be floating around on the Internet. At least one of these free hosts used something fairly similar to vBulletin, IIRC.

The process is not difficult in principle: scrape the site (I recommend a dedicated scraper for that), then go through and extract everything relevant into a SQL database formatted the way your target forum software expects. The hardest part was recovering BBCode formatting in a usable fashion. Unfortunately my converters were written back when I didn't understand HTML parsing terribly well, so they're a hodgepodge of ugly regexes and hand-rolled string parsing.
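For anyone attempting the BBCode recovery step today, here is a minimal sketch of how it could look with a real parser instead of regexes. It is not taken from the old converters. It uses Python's standard `html.parser`, and the tag mapping, the `BBCodeRecoverer` class, and the `html_to_bbcode` helper are all names made up for illustration. A real forum host will render BBCode into its own markup (quotes, smilies, spoilers), so the mapping would need to be extended per host.

```python
from html.parser import HTMLParser

# Hypothetical mapping from rendered HTML tags back to BBCode tags.
# This is an assumption for illustration; each forum host differs.
SIMPLE_TAGS = {"b": "b", "strong": "b", "i": "i", "em": "i", "u": "u"}


class BBCodeRecoverer(HTMLParser):
    """Rebuilds an approximate BBCode string from a post's rendered HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments, joined at the end
        self.stack = []  # (html_tag, closing_bbcode) for tags we opened

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SIMPLE_TAGS:
            bb = SIMPLE_TAGS[tag]
            self.out.append(f"[{bb}]")
            self.stack.append((tag, f"[/{bb}]"))
        elif tag == "a" and attrs.get("href"):
            self.out.append(f"[url={attrs['href']}]")
            self.stack.append((tag, "[/url]"))
        elif tag == "img" and attrs.get("src"):
            self.out.append(f"[img]{attrs['src']}[/img]")
        elif tag == "br":
            self.out.append("\n")
        # Any other tag is dropped; its text content is still kept.

    def handle_endtag(self, tag):
        # Close only when it matches the most recent tag we opened,
        # so stray or unknown end tags are ignored.
        if self.stack and self.stack[-1][0] == tag:
            self.out.append(self.stack.pop()[1])

    def handle_data(self, data):
        self.out.append(data)


def html_to_bbcode(html):
    parser = BBCodeRecoverer()
    parser.feed(html)
    parser.close()  # flushes any buffered trailing text
    return "".join(parser.out)


if __name__ == "__main__":
    print(html_to_bbcode('<b>Hi</b> <a href="http://example.com">link</a>'))
```

The output of `html_to_bbcode` is what would go into the post-body column of the target forum's posts table; the exact table and column names depend on the target software and are not shown here.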




Modern HTML parsers are still a hodgepodge of ugly regexes and hand-rolled string parsing.



