
Show HN: Explore 16 Years of Green Card Applications - negrit
http://data.jobsintech.io/green-cards
======
negrit
Hello everyone. I quickly built this little tool based on public records
provided by the government. I did the same a couple weeks ago with H-1Bs and I
just added over 1.1 million green card records.

The data provided is not perfect, so I'm still working on cleaning it, but it
should give you an idea of what is going on.

If you have any questions, feel free to ask.

~~~
bootload
_"If you have any questions, feel free to ask."_

What tools did you use/craft to clean the source data?

~~~
negrit
PostgreSQL :)

~~~
bootload
So just dump the data in and process via SQL. What about pre-processing, i.e.
different data sources (pdf, csv), incomplete or overlapping data? I imagine
some code had to be written to do this?

~~~
negrit
All my data sources were csv and mdb files, so I imported them directly into
PostgreSQL. Very little code was written to clean the data. 99% of the
cleaning was done with SQL queries.

The code is mostly used to display the data. Cleaning the data with SQL
queries is much faster than writing code.
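
A minimal sketch of what cleaning with SQL queries can look like. This is not the site's actual code: the `applications` table, its `country` and `status` columns, and the `clean_country_names` helper are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs on its own.

```python
import sqlite3


def clean_country_names(rows):
    """Load raw (country, status) rows, normalize them in SQL, return counts per country.

    Table and column names are hypothetical; SQLite stands in for PostgreSQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE applications (country TEXT, status TEXT)")
    conn.executemany("INSERT INTO applications VALUES (?, ?)", rows)

    # Normalize stray whitespace and inconsistent casing in one statement.
    conn.execute(
        "UPDATE applications "
        "SET country = UPPER(TRIM(country)), status = UPPER(TRIM(status))"
    )
    # Drop rows that have no usable country value.
    conn.execute("DELETE FROM applications WHERE country IS NULL OR country = ''")

    result = conn.execute(
        "SELECT country, COUNT(*) FROM applications "
        "GROUP BY country ORDER BY country"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    raw = [(" India", "Certified"), ("india ", "denied"), ("CHINA", "certified"),
           ("", "certified"), (None, "denied")]
    print(clean_country_names(raw))
```

The point of the sketch is that each cleaning step is a single set-based statement rather than a loop over records, which is presumably why SQL was faster to write here.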

------
vowelless
Just as an FYI, citizenship is not the same as country of chargeability
(usually country of birth), which is what the USCIS looks at for placing you
in EB-ROW/India/China/Mexico/Philippines. So you could have an Indian passport
but if you are born in, let's say, Saudi Arabia, you are placed in EB-ROW.

~~~
hpagey
That's correct. Also, if your spouse was born in a different country than you,
you could charge your application to that country. For example, if your spouse
was born in Kenya but you were born in India, you can charge your GC app to Kenya,
which is ROW (Rest of World).

------
thrwy10
_What is this? This website indexes all available LCAs from 2001 to 2015.
Where does the information come from? LCAs are public records and provided by
the "Office of Foreign Labor Certification"._

IMHO, this is misleading. LCAs don't have a 1-1 correlation to 'Green Cards'.
This data is based on the PERM process which is just 1 stage in the green card
process. LCAs aren't green card applications, but a labour market test based
on which green cards are applied for.

~~~
klipt
LCAs (labor condition applications) are for H1Bs, not green cards. Did you
mean LCs (labor certifications)?

~~~
tricolon
I hope he did mean LCs. It's easy to confuse the two:
[http://en.wikipedia.org/wiki/Labor_certification#Differences...](http://en.wikipedia.org/wiki/Labor_certification#Differences_with_Labor_Condition_Application)

------
ausjke
This is roughly consistent with what I know: Indian engineers take nearly
half of all H1Bs, while their major peer China takes less than 10%.

I could never figure out what causes such a big gap; I would think each
taking roughly 15~20% would make more sense.

One theory is that Indian IT giants apply for lots of H1Bs and then fill
them when they're approved; they know the system too well. Meanwhile,
Chinese IT workers/students don't have that kind of group effort.

Also, Indian managers like to hire Indians, while Chinese managers do the
opposite; over time that also makes a big difference.

No bias, just curious here. Nice work indeed.

~~~
cesarbs
I work at Microsoft and the number of Indians there seems disproportionate to
me. I too wonder why so many of them come to the US. Maybe IT education in
India is really strong?

Two funny things I observe regularly:

1) Sometimes I take an MS shuttle to work. The shuttle is full, and I'm the
only non-Indian in it.

2) Sometimes I'm in my building's lobby and when I look around, I'm the only
(or one of 2-3 people) non-Indian there.

I pity those guys because they have to wait for 10+ years to get their Green
Cards, even if they're EB-2.

~~~
a8da6b0c91d
> Maybe IT education in India is really strong?

More like systemic visa sponsorship fraud for reasons of cost saving. The visa
candidates are often as not frauds as well:
[http://www.dawn.com/news/1080040](http://www.dawn.com/news/1080040)

------
chrissnell
The 'salary' feature is pretty telling. I just looked up a former coworker by
searching for the company and narrowing down by the hire year. I knew what
country he was from so I was able to see his (starting) salary.

------
parennoob
Wow, I would really like to see my American coworkers' starting salaries the
same way they are able to look mine up in this table :)

Don't think that would ever happen though, it would be labeled as a shocking
breach of privacy.

~~~
bruceb
You can look up most US citizens' salaries if they work for a gov't agency.

~~~
parennoob
I know that.

But I work for a private company, and I would bet the majority of HN readers
do too. Imagine if your starting salary was known to your coworkers on the day
you started, but not vice versa.

------
yitchelle
Can somebody explain to me why Americans need to have a green card to work in
the US? According to the data, there are 1659 applicants with a success rate
of 68%.

~~~
DrJokepu
Also, the Soviet Union was dissolved in 1991, how come there were 4 USSR
nationals between 2001 and 2015 applying for permanent residency?

~~~
negrit
Ah, I should have made it more clear. It's the decision date, not the
application date. So probably those applicants applied when the USSR was still
a thing.

~~~
lazaroclapp
So, in essence, the bureaucracy took so long that at some point during the
process, their country ceased to exist. Kafka would be proud :)

------
rb2k_
Thanks for the work to present that data in a nice and explorable way!

It would be awesome to have another column next to "% of Greencards" that says
"% of population".

One of the reasons India has by far the most green cards is that they also are
up there when it comes to total population.

~~~
i_have_to_speak
The site says India 40.1% and China 6.7%. According to the CIA factbook, the
respective populations are 1.2 billion and 1.3 billion.

~~~
Flammy
The "% of Green Cards" refers to percent of Green Card applications in
database, not against population of the country.

~~~
rb2k_
Yes, and that's why I'd love to have another column to figure out if there are
any countries that have statistically significantly higher/lower Greencards per
capita :)

It's probably not a super useful metric, but it would be interesting to see.
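
A rough sketch of the per-capita column suggested here, using only figures quoted elsewhere in this thread (about 280K records for India and 45K for China, with populations of 1.2 and 1.3 billion). The `per_million` helper is a hypothetical name, and the inputs are approximate thread figures, not values checked against the dataset.

```python
def per_million(green_cards, population):
    """Return green card records per one million inhabitants.

    Hypothetical helper; inputs are rough figures quoted in the thread.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return green_cards / population * 1_000_000


if __name__ == "__main__":
    # Approximate figures as quoted by commenters, not verified.
    thread_figures = {
        "India": (280_000, 1_200_000_000),
        "China": (45_000, 1_300_000_000),
    }
    for country, (cards, population) in thread_figures.items():
        print(f"{country}: {per_million(cards, population):.1f} per million")
```

With these inputs the gap between the two countries stays roughly the same after normalizing, since their populations are similar.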

------
npalli
Is there some filter on this data? Green Card for a certain type of job? India
has 280K greencards and Mexico only 40K greencards in 16 years? That doesn't
seem right.

~~~
klipt
I assume this is only employment green cards. Most Mexican Americans probably
come in on family based green cards, or just got citizenship by birth in the
US.

~~~
npalli
Even China has only 45K greencards. They seem to follow the same pattern as
India for immigration, so what category are they coming in under? Last I checked,
both countries (India, China) have a similar number of immigrants in the US.

~~~
klipt
I don't know, but the EB2 and EB3 backlogs for India are longer than for China
so there must be some difference.

Assuming those numbers are from PERM applications though, they don't
correspond one-to-one to immigrants. For example Indians have a multi year
wait to get a green card after their PERM is approved, and anytime they change
employers while waiting they will likely apply for a new PERM, so by the time
they get their green card one Indian could have gone through many PERM
approvals (which this website will probably count separately).

------
slipjack
As mentioned in one of the other comments, you're getting NaN% for some
success rates. I'm guessing it's because you're using the number of certified
as the denominator, which will return NaN if it's 0.

Also, maybe this is a dumb question, but is this all green cards? For example,
your site says there were 261 green card recipients from Nigeria, which seems
quite low.

(Goes without saying - really cool!)
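
A minimal sketch of guarding the success-rate calculation against an empty denominator. How the site actually computes the rate is not known; the `success_rate` name and the certified/denied inputs are assumptions. In JavaScript, 0/0 evaluates to NaN, while the same division in Python raises ZeroDivisionError, so either way the zero case needs explicit handling.

```python
def success_rate(certified, denied):
    """Return the certified share of decided cases as a percentage.

    Returns None when there are no decided cases, so the caller can
    display a placeholder such as "n/a" instead of NaN.
    The formula is an assumption, not the site's confirmed method.
    """
    decided = certified + denied
    if decided == 0:
        return None
    return 100.0 * certified / decided


def format_rate(rate):
    """Render a rate for display, using "n/a" when it is undefined."""
    return "n/a" if rate is None else f"{rate:.1f}%"


if __name__ == "__main__":
    print(format_rate(success_rate(2, 0)))
    print(format_rate(success_rate(0, 0)))
```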

~~~
markdown
> Also, maybe this is a dumb question, but is this all green cards? For
> example, your site says there were 261 green card recipients from Nigeria,
> which seems quite low.

I wondered the same thing. Fiji is listed with 2 recipients (last I checked
Fiji's annual quota was 600).

------
rvdm
Excellent way of drawing attention to the subject! Kudos.

Might be interesting to explore the option of adding a beautiful data vis on
the home page to capture the user's attention from the get-go. Looking at the
sub pages you're obviously skilled enough at creating fascinating
visualizations.

~~~
jameshart
Curious: what sort of attention are you hoping this brings to what subject?

------
bradleyjg
Are these across all LPR categories—i.e. employment based, family based, DV,
asylum, etc.? The numbers make me inclined to think not, but I can't find any
explanation on the website of what it covers.

~~~
klipt
Since it has salary data it must be based on PERMs, which only cover a subset
of employment green cards.

------
chdir
It'll be fun to map the stock price with the hiring graph. Does a big jump in
hiring signal an oncoming price surge in the next year or so?

------
awjr
If you ever get the chance, I would consider redoing the graphs using
[https://dc-js.github.io/dc.js/](https://dc-js.github.io/dc.js/). Being able to
interact with data by clicking on graphs can make a massive difference to the
way you can interrogate data. Kudos though.

~~~
negrit
This is awesome! I'm def considering it.

------
scrumper
Search fields don't appear to work properly (Safari on OS X) - if you key in a
full search (title, city, state) and hit 'refine' it puts the title in all
three fields while showing you everything, not just search results.
Interesting site though - I want to find me :)

~~~
negrit
On which page?

~~~
scrumper
Go to a specific country, then a year. Try searching for a city or a state, it
doesn't work.

I can't now reproduce the earlier behavior I saw where a job title ended up in
all 3 search fields.

------
bechampion
[http://data.jobsintech.io/green-cards/united%20states%20of%2...](http://data.jobsintech.io/green-cards/united%20states%20of%20america?refine%5Bexact%5D=0&refine%5Bjob_title%5D=system+administrator)

~~~
vinay427
Getting an 'NaN' for the Success Rate in the table at the bottom of that page.
May be something for OP to look into.

------
mbellani
So reading into India's stats, there was a dip in 2013 followed by a rise to
the maximum GC applications by Indians ever? Is there a way to tell how many of
those were upgrades? That would be interesting because then you can tell how
many might just be repetitions.

~~~
demodifier
I also wonder what happened to Indian applicants in 2012[0]. A 0.03% success
rate.

[0] - [http://data.jobsintech.io/green-cards/india/2012](http://data.jobsintech.io/green-cards/india/2012)

~~~
govindkabra31
The dates in this dataset are the "decision" date, not the application date.

In that year, there was a fast forward movement in the backlog of India's GC
applications... Green cards have a per-country cap: at the end of the fiscal year,
USCIS can use the "unused" GC quota of other countries for the backlogged
countries. This is why you see that the "priority date" for India stays at a fixed
place (e.g., 2005 for EB2 India) for most of the year, and then suddenly moves
forward at the end of the fiscal year.

------
slater
Awesome!

Nit-pick: Any chance you could right-align the columns with numbers, and add
thousand delimiters?

------
peter303
Always interesting to look up your own company for activity. You may be
surprised.

------
username3
What's the difference between salary and prevailing wage?

[http://en.m.wikipedia.org/wiki/Prevailing_wage](http://en.m.wikipedia.org/wiki/Prevailing_wage)

------
TobbenTM
A staggering 500 applications from North Korea, wonder how they got out..

~~~
marchdown
What is notable here is the low approval rate.

------
maximedev
This is a really great tool. Kudos to negrit.

------
kitwalker12
Very interesting. Wonder what's the reason for the huge dip in Indian Software
Applications 2013 onwards?

[http://data.jobsintech.io/green-cards/india?utf8=%E2%9C%93&r...](http://data.jobsintech.io/green-cards/india?utf8=%E2%9C%93&refine%5Bjob_title%5D=Software+Engineer&refine%5Bcity%5D=&refine%5Bstate%5D=&refine%5Bexact%5D=0&commit=Refine)

~~~
dalek2point3
could be a reporting issue -- these things take time to show up in the data?

------
dataker
Interestingly:

>Soviet Union 4 | 0.0 | 2 | 0 | 0 | 2 | 100.0%

------
username3
In a company view, can you add Browse LCAs by job title?

~~~
negrit
Definitely.

------
ighost
Oh man Palau has NaN% software engineers accepted :(

~~~
catshirt
don't despair, maybe they have infinite engineers.

------
heliumcraft
is there anything similar for canada?

------
serge2k
Canada 0 in 2012 is weird.

