
A PostgreSQL Extension that allows you to access any other datasource - treo
http://multicorn.org/
======
rdunklau
Multicorn developer here, I'll be glad to answer any question.

~~~
wulczer
Crap, I'm working on something exactly like this :D Was about to upload it to
Github just today, but still wanted to fix a few things.

Yours looks quite more finished, one thing that would be very useful is
handling Python to Postgres type conversion, like PL/Python does. After that,
the next step is caching I/O functions for the duration of the scan, which
PL/Python also does.

Nice to see that the idea made sense for more than one person, hope Multicorn
will rock!

~~~
rdunklau
Sorry to beat you at it ;)

We currently have some very rudimentary python to postgres type conversion,
but this area still need a lot of improvements.

You should release your code, I'm sure you have a wide range of ideas worth
merging into Multicorn !

~~~
wulczer
Actually, the ideal thing to happen would be for PL/Python to somehow become a
shared library that provides the typecasting routines, but I'm afraid that's a
very big task and difficult to get accepted upstream. It's sad that PL/Python
already includes all the code you need to transform Python lists to Postgres
arrays, bools to PG bools and so on and every project like this one has to
duplicate all that logic.

BTW: in my approach I opted for passing the entire Qual node as a string
(using nodeToString) and my plan was to write a Python parser for that format
that takes a string and returns a Python object. Not sure how hard that will
turn out.

~~~
rdunklau
I totally agree with you regarding PL/Python. I may try to ask upstream what
they think about it.

------
leopard
I try to install this extension on Ubuntu 11.10. First of all error in
instalation -

gcc -g -O2 -fPIC -fPIC -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wformat-security -fno-strict-
aliasing -fwrapv -g -lpythonpython2: -fpic -L/usr/lib -Wl,-Bsymbolic-functions
-Wl,--as-needed -Wl,--as-needed -Wl,--as-needed -lpythonpython2: -shared -o
src/multicorn.so src/multicorn.o /usr/bin/ld: cannot find -lpythonpython2:
/usr/bin/ld: cannot find -lpythonpython2: collect2: ld returned 1 exit status
make: __* [src/multicorn.so] Error 1 rm src/multicorn.o ERROR: command
returned 2

This is because Ubuntu have - python and python2.7 bin files. I fixed this by
creating symlink on python.

After instalation another problem:

$ psql psql (9.1.1) Type "help" for help.

leo=# CREATE EXTENSION multicorn; ERROR: could not load library
"/usr/lib/postgresql/9.1/lib/multicorn.so":
/usr/lib/postgresql/9.1/lib/multicorn.so: undefined symbol: _Py_NoneStruct

All tested on specialy created for this system: $ python -V Python 2.7.2+ $
psql -V psql (PostgreSQL) 9.1.1 contains support for command-line editing

On this my tests end.

~~~
rdunklau
Thank you for this report. I'll try to make it work on Ubuntu, expect the
documentation to be updated in the next few days.

PS: if you're the 'leopard' who requested a redmine account, it should be
activated now, feel free to report it there.

~~~
leopard
Ok, I already added this to tracker. I am very interesting in this extension,
its amazing.

------
muyuu
I have well over 10K messages in gmail.

Will this work?

I was thinking in downloading the whole thing via POP3, creating a Unix
mailbox and indexing that.

~~~
rdunklau
It depends on what your use case is. If you have well targeted queries (eg,
with filters on "From" or "Subject" headers, and only query the message
headers (not the payload) you can have a pretty good performance.

What is your use case, exactly ?

~~~
muyuu
Most of my filtering would be based on the contents of the subject header. I
have a site that sends me copies of particular transactions, and these few
thousand messages are the ones I'd like to query. Ideally I'd like to have
them downloaded for statistical processing.

------
Jach
I was at OSCON Data this year that held a talk about the new foreign data
connectors piece, it's great that Postgres has them now. Want twitter data?
Make an interface to translate twitter data to rows, it's that easy! LucidDB
has had foreign data connectors for a while but you have to use Java at least
for part of it. I like the generator-style of yielding list-rows shown here:
[https://github.com/Kozea/Multicorn/blob/master/python/multic...](https://github.com/Kozea/Multicorn/blob/master/python/multicorn/csvfdw.py)

