
Show HN: An API for filling out PDF forms - nathan_f77
https://formapi.io
======
nathan_f77
Hi HN! I started FormAPI because I've been living in Southeast Asia for the
last few years, and I've had to fill in a lot of visa application and
extension forms. I want to build a service where you can fill in visa forms
online, so I built FormAPI as the first step towards that project.

I also used to work at Gusto (a payroll company), and we had to fill out a lot
of tax forms, which took a lot of work. I know this is something that many
financial and legal startups have to do, so hopefully FormAPI can make this
process a lot easier.

Let me know if you have any questions or feedback!

~~~
konschubert
This is absolutely great and it will enable A LOT of very useful applications.

One question though: I assume that the automatic detection of form fields does
not always work, especially if the PDF does not exist in a "fillable" format.
Am I right with that interpretation?

~~~
nathan_f77
Hi, at the moment we can only import fillable PDF forms (AcroForms or XFA).

In the future we might implement OCR and machine learning to automatically
detect fields in scanned or photographed PDFs. I think it could also be useful
to build an index of fillable PDF forms (e.g. from the IRS and USCIS), and
then use those original fillable PDFs whenever we detect a matching scanned
form. We might also build a public database of fillable forms.

But I don't want to build any of those features unless I have a customer who
actually needs it. (Please send an email to hello@formapi.io if this is
something you would use!)

------
Animats
From the terms:

 _FormAPI is not PCI DSS Certified or HIPAA compliant._

 _You understand that the technical processing and transmission of the
Service, including your Content, may be transferred unencrypted and involve
(a) transmissions over various networks; and (b) changes to conform and adapt
to technical requirements of connecting networks or devices._

Unfortunately, this is a service that sees the data being entered on the form.
If it processed a blank PDF form and sent back something you ran locally to
generate a filled-in form, that would be great. But as a service, it sees too
much user data.

~~~
nolite
agree... this looks amazing, but no way I could ever use it... can't send user
data to a service like this... DOA...

~~~
rublev
What's with the 3 ellipses? A few of my friends text that way and it feels
awkward.

~~~
nolite
It’s standard for unexplicit continuation of a thought...

~~~
rublev
I honestly can't read this without visualizing someone rolling their eyes
after every thought while finishing their sentence sarcastically with a vocal
fry lol.

------
alistproducer2
To the OP: Can I ask what your process was? A project like this obviously
takes a lot of time. How certain were you that you had a product people wanted
before embarking on the dev?

~~~
nathan_f77
I mentioned this in another comment, but I've had to fill in a lot of visa
application and extension forms while living in Southeast Asia. I wanted to
make that process easier, so this was the first step towards that goal.

I've also worked at companies where we built something similar in-house. So
while this is a very niche product, I know there are at least a few developers
out there who may find it useful.

Building something like this is also a great way to find new problems to
solve. I think even if it's a niche idea or something you're not sure about,
it's better to start building something than to wait for the perfect idea.

~~~
alistproducer2
Thanks for sharing that. I think using one's job to identify potential markets
is a great idea.

------
WhitneyLand
This api has inspired this idea:

Ever notice most new health care provider visits still require a ton of paper
work?

Use a mobile app to take a picture of their arbitrary form, use OpenCV to
interpret it as a real data structure, autofill it from a personal database,
print it out on one of those tiny portable printers, and hand it back to
them.

To try and make it viral: When used to process a form, the app also generates
a web site for that office. Then anyone can search for dr. lowtech’s office
and use the site to fill in and print the forms at home before coming in.

Each printed form would have a header/footer asking the provider to register
on the site, at which point it could become an official entry point into the
system and leverage other efficiencies. Some hipaa stuff applies but nothing
insurmountable.

I guess everyone else here also has 10 ideas a day pop into their head for a
startup. So on to the next one...

------
jv22222
@nathan_f77: That's an unusually large plans and pricing page:

[https://formapi.io/sign_up](https://formapi.io/sign_up)

I'm not saying that it's bad, but I think it's strongly worth your while A/B
testing that idea because it's quite granular.

You may find that with just 3 plans you would leave less money on the table
over time.

(Of course you would also have a "call for enterprise options" link.)

Another option (and I did this with pluggio) is to let, say, 50 people in
for a very low special price and then observe exactly how many api calls they
make.

Then, based on that data, work out your usage bands and magic levers.

Also, I note your only magic lever is usage but I bet there are other levers
you can use to get people from one plan to another.

Anyway, I hope this is useful feedback!

Great job on the site :)

~~~
nathan_f77
Thanks for the great feedback! We definitely need to test various pricing and
sign up pages. And yes, we will be watching how our customers use the service,
and have already discovered a lot of things that we need to improve.

------
bubso
Hey guys,

Not trying to plug our co but it's totally relevant here... we built a field
detection algorithm to find the fields in any form. So you can drag n drop
your doc and start filling it in and sign it within a couple of seconds. Feel
free to check it out. Thanks guys!
[https://www.paperjet.com/](https://www.paperjet.com/)

~~~
ocrcustomserver
This is great! How do you detect the fields? Are you using image processing?

------
efitz
Most forms contain personally identifiable information. Some contain extremely
sensitive information; for example, the kinds of forms that you describe (visas,
etc.) contain national ID numbers and so forth. The idea of providing this to
a third party web site scares the daylights out of me. Is it possible for me
to obtain the source code on Github and host it myself?

~~~
nathan_f77
I totally agree, these forms can contain very sensitive information. Actually
there is some information that is too sensitive: we are not currently PCI DSS
Certified or HIPAA compliant, so we cannot handle credit card details or
protected health information.

We are very serious about protecting PII. One of the ways we achieve that is
by using battle-tested frameworks and libraries with the default settings
(Rails and devise), and not writing our own code for crypto and security.

By default, we delete generated PDFs and any associated data after 7 days.
This can be configured in the template settings [1], so you can make the
retention period much shorter. You can also immediately delete any submitted
data by making an API request [2]. (Disclaimer: Data may be present in our
automated database backups for up to 2 weeks.)

Finally, FormAPI is not open source, but we can provide a license for a
self-hosted installation. For enterprise plans and on-site hosting, please
contact: enterprise@formapi.io

[1]
[https://formapi.io/docs/template_editor/settings.html#expire...](https://formapi.io/docs/template_editor/settings.html#expire-submissions)

[2]
[https://formapi.io/docs/api/expire_submission.html](https://formapi.io/docs/api/expire_submission.html)

~~~
blowski
> We are very serious about protecting PII.

Problem is, everyone (from Equifax to Yahoo) says that. If I can't trust a
huge multi-billion-dollar corporation, I would certainly be nervous about
trusting a very new startup with much more than my email address.

~~~
nathan_f77
Yes, that is very true.

I am planning to move my hosting to Aptible [1], and will become PCI certified
and HIPAA compliant.

If anyone is interested in FormAPI, but requires PCI certification and HIPAA
compliance, please send an email to compliance@formapi.io. We'll let you know
when we are ready. You can also send an email to enterprise@formapi.io to
inquire about on-site hosting.

[1] [https://www.aptible.com/](https://www.aptible.com/)

~~~
runako
Congratulations on your launch!

On the one hand, I understand the concerns about PII in your app.

On the other hand, I'd be willing to bet there are a ton of line-of-business
apps that don't handle PII that could benefit from this (purchase orders, B2B
shipping forms, etc.).

Unless you have investors, I would suggest waiting until you get >100 paying
customers to find out whether you need to pay the premium for PCI/HIPAA
hosting.

~~~
nathan_f77
Thanks for the feedback @runako, yes PCI and HIPAA compliance is very
expensive. I will need to hear from more customers who need this level of
security and compliance before I can afford to make the jump.

------
fetch1
This looks really good and the demo is really effective. Best of luck to you!

Was DocRaptor a big influence on how you priced it?

Shameless plug: I've also recently been working with PDFs - a PDF template
creator and API [0] - but I have a different use case in mind.

[0] [https://fetchpdf.com](https://fetchpdf.com)

~~~
konschubert
Hi I like this.

Are you aware of [https://www.docmosis.com/](https://www.docmosis.com/) ?

How do you compare to them?

~~~
fetch1
I wasn't aware of them, no. They're definitely more established and have more
features.

The goal of FetchPDF was to let web applications give their users a way to
create and edit PDF templates in the browser. e.g. to customise system
templates for invoices, certificates etc.

I still have a lot of work to do, it would seem.

~~~
konschubert
Don't let it discourage you. They have their own set of problems, the first
one being that they use libre office for templates, a piece of software that
not everybody likes or is familiar with.

I've been a happy customer of docmosis, but I'm still interested in using your
service if it provides easier templating.

~~~
fetch1
Please feel free to sign up for the trial.

Send through an email to the support email address and I'll upgrade the trial
and set an API key without you needing to put in billing details.

But be warned - it's still in MVP stage and may disappoint! (It also means
that it's very receptive to feedback :)

~~~
konschubert
All right, I signed up :)

The big selling point for me is that the template generation is right there.
It is much easier to explain to people than saying: All right, so you download
libre office, create a document, upload it here, name it like this or that...

------
mariusz331
Hey Nathan! Great to see another product in the PDF space. I see a few
similarities to PDF Otter[0] and was wondering if you drew any inspiration
from it.

[0] [https://www.pdfotter.com](https://www.pdfotter.com)

~~~
nathan_f77
Hey, I did come across PDF Otter a few weeks ago when I found your Show HN
post [1]. There was some great feedback in the comments. I started building
FormAPI a few months ago, and should have done more market research!

[1]
[https://news.ycombinator.com/item?id=14805750](https://news.ycombinator.com/item?id=14805750)

~~~
mariusz331
No worries! Feel free to email me if you want to collaborate: Mariusz at PDF
Otter dot com

------
jonaldomo
Looks good. I think some real life use cases would help on the sales site.

So the use case is you have a company that only accepts PDFs as an input and
you have a digital data set that you need to get each individual row into one
PDF?

How do you handle signatures?

~~~
stult
Not OP but one use case is charitable grants. I made an app for a homeless
agency my girlfriend used to work at for automatically filling out PDFs. They
spent an inordinate amount of time filling out grant application PDFs, most of
which requested essentially the same info. We made a simple WPF app that let
them tag each field on the PDF as a particular info type (e.g. address, or
address line 1, or statement of organization's purpose) then click a button to
autofill. They keep all the data in a single underlying spreadsheet for
convenience and just update that, rather than treating each of their hundreds
of applications as bespoke.
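
The tagging approach described above can be sketched roughly as follows. This is only an illustration in Python, not the WPF app itself: the field names, tag names, and record values are all invented, and the function `autofill` is a hypothetical helper. The idea is that each PDF field is tagged once with an info type, and every form is then filled from the same shared record.

```python
# Hypothetical sketch: the field names, tags, and record below are invented
# for illustration and do not come from the app described above.

def autofill(field_tags, record):
    """Map each PDF field name to the record value for its tag.

    field_tags: dict of PDF field name -> info type tag
    record: dict of info type tag -> value (e.g. one spreadsheet row)
    Returns (filled, missing): filled values keyed by field name, and the
    sorted list of field names whose tag has no value in the record.
    """
    filled = {}
    missing = []
    for field_name, tag in field_tags.items():
        if tag in record and record[tag] != "":
            filled[field_name] = record[tag]
        else:
            missing.append(field_name)
    return filled, sorted(missing)


# One grant application's fields, each tagged once with an info type.
grant_form_tags = {
    "OrgName": "organization_name",
    "Addr1": "address_line_1",
    "Mission": "statement_of_purpose",
    "EIN": "tax_id",
}

# A single shared record, as it might be read from the spreadsheet.
agency_record = {
    "organization_name": "Example Shelter",
    "address_line_1": "1 Main St",
    "statement_of_purpose": "Provide emergency housing.",
}

filled, missing = autofill(grant_form_tags, agency_record)
```

Returning the untagged or unfilled fields separately makes it easy to see which parts of a given application still need manual attention.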

Also I work at an accounting firm and we have a home grown solution that does
precisely this for all of the standard IRS forms we fill out. For a larger
firm like mine, we can afford to have someone code the solution in house, but
for smaller firms or for in-house accounting departments, this could be
worthwhile, depending on the price point.

------
asteinbr
There is also php-pdftk
([https://github.com/mikehaertl/php-pdftk](https://github.com/mikehaertl/php-pdftk)).
Great software for filling PDF forms.

------
steveneo
I like this service as well:
[https://www.platoforms.com](https://www.platoforms.com).

------
tyingq
Would be interested to hear a bit about the internals. Which PDF libraries you
tried, which one you ended up using, why, etc.

~~~
nathan_f77
I built the site using Ruby on Rails, and ended up trying all the PDF
libraries I could find. In the end I had to use six different libraries and
tools to solve various problems with parsing or transforming PDFs.

The libraries I use are: pdf-reader [1], prawn [2], prawn-templates [3],
origami [4], qpdf [5], and pdftk [6].

There's a new PDF library for Ruby called hexapdf [7], and it looks really
promising. I think it might even be able to replace all six of these
libraries, but it's released under the AGPL license, and commercial licenses
are not available yet.

[1] [https://github.com/yob/pdf-reader](https://github.com/yob/pdf-reader)

[2] [https://github.com/prawnpdf/prawn](https://github.com/prawnpdf/prawn)

[3] [https://github.com/prawnpdf/prawn-templates](https://github.com/prawnpdf/prawn-templates)

[4] [https://github.com/gdelugre/origami](https://github.com/gdelugre/origami)

[5] [http://qpdf.sourceforge.net/](http://qpdf.sourceforge.net/)

[6] [https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/](https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/)

[7] [https://hexapdf.gettalong.org/](https://hexapdf.gettalong.org/)

------
chromano
Will there be an API sometime in the future? Also will it be possible to
whitelabel the template editor and/or make it embeddable?

------
gowan
this is an interesting idea. i like how it is declarative.

acrobat FDF[1] and Apache PDFBox[2] provide something similar but are not
declarative. i'd be interested in seeing an approach with mozilla/pdf.js ...
though i'm not sure how usable it would be.

[1]
[http://www.adobe.com/devnet/acrobat/fdftoolkit.html](http://www.adobe.com/devnet/acrobat/fdftoolkit.html)

[2] [https://pdfbox.apache.org/](https://pdfbox.apache.org/)

------
kapauldo
Is this web form to pdf or pre existing fillable pdf to json?

~~~
nathan_f77
FormAPI converts any existing PDF into a template that can be filled out via
an API call, or via an online form.

If you upload a scanned PDF (e.g. plain images), then you can add form fields
in the template editor. If you upload a pdf with a fillable form, then we will
automatically import all of the existing fields (and the imported fields can
then be modified or removed.)

~~~
reboog711
I'm really confused. Is this intended as a programmer's tool? Or as an end
user tool?

If I understand it, create an HTML form, submit to server, spit back PDF. This
has been built into ColdFusion for about a decade, and it is trivial. [and I
assume other server side software can do the same thing].

Aside from that, the bulk of the PDF Forms I download these days can already
be filled out with Acrobat. I'm not sure what benefit you're offering me in
time savings.

~~~
nathan_f77
This is a tool for programmers who need to fill out a lot of PDFs
automatically. For example, a freelancing service could use FormAPI to fill
out W-9 forms for contractors.

Or if I built a service for end-users to fill out and sign PDFs, I would
create a separate website and use FormAPI to generate the PDFs. (I don't think
I will do this, though.)

You're right that if you just need to fill in a single form, then there are
easier ways to do that. (Even just using the Preview app on Mac).

~~~
reboog711
If I download a W9 from the IRS web site I can fill it out directly using
Acrobat. I'm pretty sure the free reader is all you need.

I guess I'm not understanding the niche you fill.

Good luck with it, though.

------
Tijdreiziger
Why exactly is this a web API? Don't get me wrong, it's great work, but it
seems like something that would be much better suited to a library.

~~~
ateesdalejr
He can make money off of a web API. A library would be hard to keep closed
source.

------
plicense
Fantastic work! The UI looks really slick! How long have you been working on
this project?

~~~
nathan_f77
Thanks! I've been working on this for about 2 months.

------
rublev
I've got formy.io if you want it. Was building something similar way back.

------
ocrcustomserver
Cool project!

Shameless plug: if you need the opposite, i.e. getting text/structured data
out of filled PDF forms (or any kind of PDF document), feel free to contact
me.

------
rokhayakebe
Emailed you Nat (when you get a chance).

------
taariqlewis
How does this handle user signatures?

~~~
nathan_f77
There is currently no signature support on the online forms, but that's
something I would like to add soon.

When generating a PDF via an API request, you can add an image field, and
upload your user's signature as an image. You could also create a faux
signature by using a text field with a handwriting font (The Dancing Script
[1] font is available.)

This is a tool for developers to fill out PDFs, so in your own application you
could use something like the signature_pad library [2], and then pass the
signature images to FormAPI.

[1]
[https://fonts.google.com/specimen/Dancing+Script](https://fonts.google.com/specimen/Dancing+Script)

[2]
[https://github.com/szimek/signature_pad](https://github.com/szimek/signature_pad)
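
As a rough sketch of the "pass the signature image along" step: assuming your application has already captured the signature and has it as PNG bytes on the server, you could encode it as a base64 data URI and include it with the other field values. The request shape here (a `data` wrapper and a field named `signature`) is hypothetical and is not taken from the FormAPI docs; check the real API reference for how image fields are actually submitted.

```python
import base64
import json


def image_to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URI (RFC 2397)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_submission(fields, signature_bytes):
    """Return a JSON body with text fields plus the signature image.

    The "data" wrapper and the "signature" field name are hypothetical.
    """
    payload = {"data": dict(fields)}
    payload["data"]["signature"] = image_to_data_uri(signature_bytes)
    return json.dumps(payload, sort_keys=True)


# PNG file signature bytes standing in for a real exported signature image.
fake_png = b"\x89PNG\r\n\x1a\n"
body = build_submission({"name": "Jane Doe"}, fake_png)
```

The resulting `body` string is what you would send as the request body with whatever HTTP client you already use.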

------
jaequery
after you fill out the form, does this return back a PDF to you with fields
filled out?

~~~
nathan_f77
Yes, when you submit the form, we fill out the fields, and then you can
download the generated PDF.

~~~
jaequery
will the generated PDF be in the original format of the PDF?

~~~
nathan_f77
Yes, we return the original PDF with the fields filled out. However, for PDFs
with "fillable forms", we remove the forms, so the PDF is no longer editable.
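
For anyone who wants a similar "filled and no longer editable" result locally, one common route is pdftk (one of the tools listed elsewhere in this thread): write the values to an FDF file, then run `fill_form` with the `flatten` option. Whether FormAPI does it this way is not stated here, so treat the following Python as a minimal sketch under that assumption. The file names and field values are invented, and the FDF builder only handles plain ASCII values.

```python
def escape_pdf_string(text):
    """Escape backslashes and parentheses for a PDF literal string."""
    return (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


def build_fdf(values):
    """Build a minimal FDF document from a dict of field name -> value."""
    fields = "\n".join(
        f"<< /T ({escape_pdf_string(name)}) /V ({escape_pdf_string(value)}) >>"
        for name, value in values.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n"
        f"{fields}\n"
        "] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )


def pdftk_fill_command(form_pdf, fdf_path, output_pdf):
    """Return the pdftk argv that fills the form and flattens it."""
    return ["pdftk", form_pdf, "fill_form", fdf_path,
            "output", output_pdf, "flatten"]


# Example values; the field name must match a field in the actual PDF form.
fdf = build_fdf({"name": "Jane (JD) Doe"})
command = pdftk_fill_command("form.pdf", "data.fdf", "filled.pdf")
# To run it for real (requires pdftk installed, form.pdf present, and the
# FDF text written to data.fdf first):
#   import subprocess; subprocess.run(command, check=True)
```

The `flatten` option merges the field values into the page content, so the output has no interactive form fields left.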

