
Wolfram Alpha’s New Data about Pokémon - ndrake
http://blog.wolframalpha.com/2013/10/10/gotta-compute-em-all-wolframalphas-new-data-about-pokemon/
======
carlob
I think it's pretty cool that you can actually compute stuff about the data
you extract from these queries

[https://www.wolframalpha.com/input/?i=linear+regression+of+h...](https://www.wolframalpha.com/input/?i=linear+regression+of+height++vs+attack+of+electric+pokemon+vs+all+pokemon)

------
_frog
It would be extremely cool if they'd expose information such as what moveset
each Pokémon has available to them so I could make queries like 'what water
type Pokémon that can learn Hydro Pump has the highest Special Attack stat'.

~~~
mappu
Maybe consider scraping bulbapedia into an SQL database? :)

    
    
        -- i'm dreaming
        SELECT pokemon.name
        FROM
          pokemon
          JOIN pokemon_learn_moves ON pokemon.id = pokemon_learn_moves.pokemon_id
          JOIN move ON pokemon_learn_moves.move_id = move.id
        WHERE
          move.name_en = 'Hydro Pump'
        ORDER BY pokemon.sp_atk DESC LIMIT 1;
    

Actually, from this perspective there's a lot of boilerplate, no wonder people
like key-value stores... Then use python-nltk to make an english wrapper (and
say goodbye to your free time for the next month!)
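
For anyone who wants to try that query, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The three tables follow the dreamed-up schema above. The `type1` column (needed for the "water type" part of the original question) and every row are made-up placeholders, not scraped data.

```python
import sqlite3

# Hypothetical schema: the three tables mirror the query above; the `type1`
# column and every row below are made-up placeholders, not real game data.
SCHEMA = """
CREATE TABLE pokemon (id INTEGER PRIMARY KEY, name TEXT, type1 TEXT, sp_atk INTEGER);
CREATE TABLE move (id INTEGER PRIMARY KEY, name_en TEXT);
CREATE TABLE pokemon_learn_moves (pokemon_id INTEGER, move_id INTEGER);
"""

QUERY = """
SELECT pokemon.name
FROM pokemon
JOIN pokemon_learn_moves ON pokemon.id = pokemon_learn_moves.pokemon_id
JOIN move ON pokemon_learn_moves.move_id = move.id
WHERE move.name_en = ? AND pokemon.type1 = ?
ORDER BY pokemon.sp_atk DESC
LIMIT 1;
"""

def best_special_attacker(conn, move_name, type_name):
    """Return the name of the strongest special attacker, or None."""
    row = conn.execute(QUERY, (move_name, type_name)).fetchone()
    return row[0] if row else None

def build_demo_db():
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO pokemon VALUES (?, ?, ?, ?)", [
        (1, "Placeholder A", "Water", 80),
        (2, "Placeholder B", "Water", 110),
        (3, "Placeholder C", "Fire", 130),   # wrong type, must be filtered out
        (4, "Placeholder D", "Water", 150),  # right type, cannot learn the move
    ])
    conn.executemany("INSERT INTO move VALUES (?, ?)",
                     [(1, "Hydro Pump"), (2, "Surf")])
    conn.executemany("INSERT INTO pokemon_learn_moves VALUES (?, ?)",
                     [(1, 1), (2, 1), (3, 1), (4, 2)])
    return conn

if __name__ == "__main__":
    print(best_special_attacker(build_demo_db(), "Hydro Pump", "Water"))
```

Passing the move and type as `?` parameters also sidesteps the string-quoting question entirely.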

~~~
_frog
I've thought about doing this before actually, don't know how they'd feel
about someone scraping their content though.

~~~
mappu
It's not like bulbapedia own the fundamental content - do they place a public
license on the wiki pages?

Ideally bulbapedia would provide mediawiki dumps for this, but they don't, and
they've gone on record saying they don't intend to. They did leave the
mediawiki API open though if you want to crawl a clean rip of each page's
wikitext - the default mediawiki API guidelines are also intact, which say
that single-threaded crawls should be acceptable in almost all instances, but
you should warn the site owner before initiating a multi-threaded scrape.
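
A rough sketch of what a polite single-threaded crawl could look like. The query parameters are the standard MediaWiki ones for fetching a page's latest wikitext. The endpoint path is my guess and should be checked first, and `fetch` is a hypothetical callable you supply yourself (for example a thin wrapper around `urllib.request.urlopen`).

```python
import time
from urllib.parse import urlencode

# Assumed endpoint: MediaWiki installs usually expose api.php, but this exact
# path is a guess and should be checked against the wiki before use.
API_URL = "https://bulbapedia.bulbagarden.net/w/api.php"

def wikitext_url(title, api_url=API_URL):
    """Build a MediaWiki API URL that asks for the latest wikitext of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)

def crawl(titles, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch pages one at a time (single-threaded), pausing between requests.

    `fetch` is any callable that takes a URL and returns the response body.
    """
    results = {}
    for i, title in enumerate(titles):
        if i > 0:
            sleep(delay_seconds)  # no pause needed before the first request
        results[title] = fetch(wikitext_url(title))
    return results
```

Keeping `fetch` and `sleep` injectable means the loop can be tried out with fakes before it ever touches the real site.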

~~~
_frog
It looks like all of Bulbapedia's content is licensed under the Creative
Commons Attribution-NonCommercial-ShareAlike license.
[http://bulbapedia.bulbagarden.net/wiki/Bulbapedia:Copyrights](http://bulbapedia.bulbagarden.net/wiki/Bulbapedia:Copyrights)

------
officemonkey
If only "pikachu versus bulbasaur" suggested the best strategy for each. :-D

~~~
recuter
Umm, that's a very unbalanced matchup, bulbasaur wins every time. Water >
Electricity. If I recall, even a Raichu would lose to bulbasaur.

Probably shouldn't be posting this. :P

~~~
mtinkerhess
But Bulbasaur is grass / poison type. Electric is not very effective against
grass, so Bulbasaur will have an advantage unless Pikachu avoids using its
electric attacks.

~~~
aspensmonster
IIRC, neither one has an advantage or disadvantage against the other. It'd
actually be an even match!

Edit: According to [http://pokemondb.net/type](http://pokemondb.net/type),
when Grass attacks Electric, the attack is "normal" and deals 100% of its
damage. However, when Electric attacks Grass, the attack is "not very
effective" and deals only 50% of its damage.

So Bulbasaur is totally going to win this.
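
The arithmetic as a small sketch. Only the matchup from this thread is in the table and everything else falls back to neutral, so this is a deliberately partial chart, not the real one. For dual types the per-type modifiers multiply.

```python
# Only the matchup discussed in this thread is listed; any pair not in the
# table is treated as neutral (1.0). This is a partial chart, not the full one.
EFFECTIVENESS = {
    ("Electric", "Grass"): 0.5,  # "not very effective"
}

def damage_multiplier(attack_type, defender_types):
    """Multiply the per-type modifiers for a defender with one or two types."""
    multiplier = 1.0
    for defender_type in defender_types:
        multiplier *= EFFECTIVENESS.get((attack_type, defender_type), 1.0)
    return multiplier

BULBASAUR = ("Grass", "Poison")
PIKACHU = ("Electric",)

print(damage_multiplier("Electric", BULBASAUR))  # 0.5
print(damage_multiplier("Grass", PIKACHU))       # 1.0
```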

------
NicoJuicy
Always liked WolframAlpha, this is also awesome.

But what I really like is comparing things like the protein in bananas vs.
spinach to decide what I'm going to (prefer to) eat soon ^^

So, thanks!

~~~
scott_karana
Wow, I didn't realize it had food. Good call! :)

~~~
NicoJuicy
Yeah, no problem, just shared some awesomesauce

------
joshfraser
The technology behind WolframAlpha is truly incredible. It's likely one of the
most undervalued resources of our day.

~~~
jnazario
yes and no. it's a tremendous amount of work to organize information and build
associations between what you ask and what you get, although it certainly has
a lot of gaps. it's also a neat way to look at how to combine information.

that said it's within grasp of many of us: natural language interfaces (NLI)
and SPARQL databases and endpoints. have a look at this semanticweb q&a:

[http://answers.semanticweb.com/questions/12747/natural-langu...](http://answers.semanticweb.com/questions/12747/natural-language-to-sparql)

some good links in there. basically find your SPARQL endpoints, have a list of
synonyms mapped between your inputs (which you parse with NLP tools like weka
or the stanford parser, or even python's nltk) and map your query to your
ontology (or ontologies) from your endpoints. then try successive answers.
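
a toy sketch of the synonym-mapping step, with no real NLP involved. the `ex:` prefix, the predicate names and the "entity is the last word" rule are all invented for illustration and don't come from any real ontology or endpoint:

```python
# Hypothetical vocabulary: the `ex:` prefix and every predicate name below
# are invented for illustration and do not belong to any real ontology.
SYNONYMS = {
    "height": "ex:height",
    "tall": "ex:height",
    "weight": "ex:weight",
    "heavy": "ex:weight",
}

def to_sparql(question):
    """Map the first known keyword in a question to a SPARQL query string."""
    words = question.lower().replace("?", " ").split()
    if not words:
        return None
    entity = words[-1]  # naive assumption: the entity is the last word
    for word in words:
        predicate = SYNONYMS.get(word)
        if predicate:
            return (
                "PREFIX ex: <http://example.org/>\n"
                f'SELECT ?value WHERE {{ ?s ex:name "{entity}" ; {predicate} ?value . }}'
            )
    return None  # no keyword recognized

print(to_sparql("How tall is Pikachu?"))
```

a real system would swap the keyword lookup for a proper parser and send the query to an actual endpoint.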

a good, simple interface to play around with that is quepy:

[http://quepy.machinalis.com/](http://quepy.machinalis.com/)

a few others exist.

hope that helps. despite challenges in the adoption rates of the semantic web,
i think it's the future of information retrieval because it makes sense for us
as users and truly organizes information.

~~~
taliesinb
Some of our stuff is simple database lookup (à la Google Knowledge Graph),
other stuff is more algorithmic and computational in nature.

The problem we've had with SPARQL and co is that we feel it isn't optimized
for computational queries. Ontologies don't matter as much in that case, and
inference in tuple stores costs you significantly in performance, although the
technology is improving.

As often as not, however, the computationally irreducible work lies in making
a domain suitable for computational consumption, not in the technology used
for representation.

To analogize, UTF-8 is great, but without the notion of Unicode code points it
wouldn't exist.

------
ecto
How would one generate this equation from an image?
[http://www.wolframalpha.com/input/?i=snorlax+plane+curve](http://www.wolframalpha.com/input/?i=snorlax+plane+curve)

~~~
zalzane
I've never done it, but here's how I would do it off the top of my head.

-Vectorize the image into a set of bezier curves that have endpoints at intersections

-Walk the bezier curves so that the entire image can be represented by a single "strip" of piecewise bezier sections. If there are any overlapping sections from having to rewalk a curve, maybe apply some offsets/small modifications to those curves to give it a "sketched" look.

-Rasterize the bezier curves in order at your desired resolution into a list of XY coordinates. Make sure the XY coordinates remain in the order they were rasterized.

-Split the coordinates of each rasterized pixel so that you have two lists, (t, x) and (t, y), where t is the index of the rasterized pixel. Now you can represent the X and Y coordinates of your sketch on two coordinate planes as a function of t.

-For each coordinate plane X and Y, record the index where the second derivative of the rasterized graph changes. This lets you distinguish subcurves on the line. For this to work it's probably important that the rasterization was done at a relatively high resolution.

-For each subcurve, generate an extremely low frequency trigonometric function that can represent that subcurve. Ideally outside of that subcurve, the trig function should be at or very near zero. This might require layering some additional trig functions in order to eliminate any noise.

-With the trig functions generated, return the results as a parametric function.
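
A minimal sketch of the last few steps, with one substitution: instead of fitting a separate trig function to each subcurve, it fits a single truncated Fourier series to the whole ordered point list, treating each point as a complex number x + iy. It assumes the vectorize/walk/rasterize steps already produced an ordered, closed list of (x, y) points, and it makes no claim about how Wolfram Alpha actually does it.

```python
import cmath
import math

def fourier_coefficients(points, max_frequency):
    """Discrete Fourier coefficients of a closed curve given as ordered (x, y).

    Keep max_frequency well below len(points) / 2 to avoid aliasing.
    """
    n = len(points)
    samples = [complex(x, y) for x, y in points]
    coefficients = {}
    for k in range(-max_frequency, max_frequency + 1):
        total = sum(z * cmath.exp(-2j * math.pi * k * i / n)
                    for i, z in enumerate(samples))
        coefficients[k] = total / n
    return coefficients

def evaluate(coefficients, t):
    """Evaluate the parametric curve at t in [0, 1); returns (x(t), y(t))."""
    z = sum(c * cmath.exp(2j * math.pi * k * t)
            for k, c in coefficients.items())
    return z.real, z.imag
```

Expanding each complex exponential into sines and cosines gives x(t) and y(t) as plain sums of trig terms. Raising `max_frequency` adds detail at the cost of a longer equation.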

------
unknownian
I tried Xerneas (new legendary) but it is still in the process of researching.
Also, for those who haven't played pokemon in a while, it's much more than
just type match-ups, no matter what the anime leads you to think. There is
much more strategy involved as it is a somewhat detailed RPG.

------
greyfox
pwnt~ [http://www.wolframalpha.com/input/?i=porygon-like+curve](http://www.wolframalpha.com/input/?i=porygon-like+curve)

------
tharshan09
This is pretty neat. Anyone have any ideas about how they obtained such
complete data? Is it really from scraping the other sites mentioned in the
threads?

------
bitemix
If they did this for League of Legends characters and build calculations, I'd
never leave the site.

~~~
KnightHawk3
That is actually a really good idea for a website.

I should attempt it as a hobby sometime.

------
jwr
I'd be much more interested in WolframAlpha merging data about StarCraft II.

There is a lot of computing going on there, DPS (Damage Per Second) possibly
modified by upgrades and enemy type, etc. This data could be genuinely useful
to the SC community.
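
To give an idea of the kind of calculation I mean, here is a rough sketch. It assumes a simplified model where upgrades and type bonuses add to each hit and the target's armor subtracts from it. All parameter names and the numbers in the example are hypothetical, not real unit stats.

```python
def damage_per_second(base_damage, cooldown_seconds, upgrade_bonus=0.0,
                      bonus_vs_type=0.0, target_armor=0.0, attacks=1):
    """Sustained DPS of one unit against one target, under a simplified model.

    Assumption: each hit deals base + upgrade + type bonus - armor, floored
    at zero, and the unit lands `attacks` hits every `cooldown_seconds`.
    """
    per_hit = max(base_damage + upgrade_bonus + bonus_vs_type - target_armor, 0.0)
    return per_hit * attacks / cooldown_seconds

# Hypothetical unit: 10 damage every 2 seconds, +2 from upgrades, 1 armor on target.
print(damage_per_second(10, 2, upgrade_bonus=2, target_armor=1))  # 5.5
```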

~~~
benvds
And when including timings you could easily calculate and optimize build
orders.

------
hayksaakian
Does this mean that Siri can now answer questions about pokemon?

------
cbhl
Ironically, the new dataset doesn't seem to contain any of the Pokémon
introduced in X or Y (yet).

------
cmiller1
I entered in "(the pokedex number of geodude)*5"

Expected result: 370 Result: 74

------
hiddensanctum
All I have to say is that is awesome

------
thealphanerd
I love the pikachu-like curve

~~~
Cookingboy
I just burst out laughing imagining people saying that as a pickup line.

------
pyrocat
That's... cool I guess? There are a bunch of other sites that already do this
though. Serebii, Bulbapedia, veekun, pokemondb, etc.

~~~
scott_karana
Those other sites allow you to plot distribution and find sets of specific
numerical characteristics that easily? I only remember seeing generic
statistic listings.

~~~
omnipath
[http://veekun.com/dex/pokemon/search](http://veekun.com/dex/pokemon/search)
does all that Wolfram Alpha is claiming to do, other than the grouping, and to
be honest, I'm not sure how useful that'll be beyond statistical notekeeping.

------
jseip
New pickup line: I maintain the Pokemon database for Wolfram Alpha
#chicksdignerds #notreally #hopeforIPOmoney

