American Lawful Immigration 2018: New LPRs by Country of Birth

Lawful Permanent Residents (LPRs) are non-citizens lawfully authorized to live permanently in the United States. Let’s explore data from the U.S. Department of Homeland Security to highlight the regions and countries of birth of the people who obtained a green card in 2018.

Import required libraries

In [1]:

import pandas as pd
import plotly.express as px
import plotly.io as pio
from IPython.display import Javascript

Javascript(
"""require.config({
 paths: { 
     plotly: 'https://cdn.plot.ly/plotly-latest.min'
 }
});"""
)

pio.renderers.default = 'notebook_connected'

Data pre-processing

To map the number of new U.S. Lawful Permanent Residents in 2018, we need to merge two datasets:

The Persons Obtaining Lawful Permanent Resident Status by Region and Country of Birth dataset, our main dataset, which contains, among other things, the number of new LPRs in 2018 for each country of birth;
The ISO codes dataset, which lets us attach the corresponding region and ISO-alpha3 code to each country of birth.

New Green Card recipients 2018 dataset

In [2]:

df = pd.read_csv('Data/Persons_Obtaining_Lawful_Permanent_Resident_Status_by_Region_and_Country_of_Birth.csv',
                sep=';',
                skiprows=14,
                na_values=['X', 'D'])

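# Cleaning: drop columns 1 to 9, keeping only the country and the 2018 figures
df.drop(columns=df.columns[1:10], inplace=True)
df.columns = ['Country of Birth', 'LPRs 2018']
# Rows whose value was flagged 'X' or 'D' were read as NaN and are dropped here
df.dropna(inplace=True)
# Strip the spaces inside the figures before converting them to integers
df['LPRs 2018'] = df['LPRs 2018'].str.replace(" ", "")
df['LPRs 2018'] = df['LPRs 2018'].astype('int')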

df.head()

Out[2]:

	Country of Birth	LPRs 2018
0	Afghanistan	12935
1	Albania	5049
2	Algeria	2123
4	Angola	176
5	Anguilla	23
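
The cleaning steps above can be tried on a small in-memory sample. This is only a sketch: the three-row CSV below is hypothetical (`Country B` is a made-up placeholder, and the space inside `12 935` is assumed to be how the raw file writes its figures), and it is not the DHS file itself.

```python
import io
import pandas as pd

# Hypothetical sample mimicking the two columns kept after cleaning
raw = io.StringIO(
    "Country of Birth;LPRs 2018\n"
    "Afghanistan;12 935\n"
    "Country B;D\n"
    "Angola;176\n"
)

# 'X' and 'D' are read as NaN, exactly as in the read_csv call above
sample = pd.read_csv(raw, sep=';', na_values=['X', 'D'])

# Drop flagged rows, strip spaces inside the figures, then convert to int
sample = sample.dropna()
sample['LPRs 2018'] = (
    sample['LPRs 2018'].str.replace(' ', '', regex=False).astype('int')
)

print(sample)
```

Because `dropna` keeps the original row labels, the index of the cleaned frame has gaps, which is why the output above jumps from 2 to 4.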

ISO codes dataset

In [3]:

df_iso = pd.read_csv('Data/ISO_Codes.csv', 
                     sep=';', 
                     usecols= ['Region Name','Country or Area','ISO-alpha3 Code']
                    )

df_iso.rename(columns = {'Country or Area':'Country of Birth'}, inplace = True)

df_iso.head()

Out[3]:

	Region Name	Country of Birth	ISO-alpha3 Code
0	Africa	Algeria	DZA
1	Africa	Egypt	EGY
2	Africa	Libya	LBY
3	Africa	Morocco	MAR
4	Africa	Sudan	SDN

Merging

In [4]:

df_merge = pd.merge(df, df_iso, on='Country of Birth')
df_merge = df_merge.sort_values('LPRs 2018', ascending=False)

# Add a percentage column
df_merge['Percentage'] = df_merge['LPRs 2018']/df_merge['LPRs 2018'].sum()*100

df_merge.head()

Out[4]:

	Country of Birth	LPRs 2018	Region Name	ISO-alpha3 Code	Percentage
118	Mexico	161858	North America	MEX	14.787832
47	Cuba	76486	North America	CUB	6.987990
39	China, People's Republic	65214	Asia	CHN	5.958147
84	India	59821	Asia	IND	5.465426
55	Dominican Republic	57413	North America	DOM	5.245424
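
`pd.merge` performs an inner join by default, so any country whose name is spelled differently in the two datasets is silently dropped. One way to spot such rows, sketched here on toy data, is a left join with `indicator=True`. The frames `lprs` and `iso` and the name `Country X` are hypothetical stand-ins, not the real datasets.

```python
import pandas as pd

# Toy stand-ins for df and df_iso; 'Country X' has no ISO entry on purpose
lprs = pd.DataFrame({
    'Country of Birth': ['Mexico', 'Cuba', 'Country X'],
    'LPRs 2018': [161858, 76486, 10],
})
iso = pd.DataFrame({
    'Country of Birth': ['Mexico', 'Cuba'],
    'Region Name': ['North America', 'North America'],
    'ISO-alpha3 Code': ['MEX', 'CUB'],
})

# A left join keeps every row of lprs; the _merge column says where it matched
check = pd.merge(lprs, iso, on='Country of Birth', how='left', indicator=True)
unmatched = check.loc[check['_merge'] == 'left_only', 'Country of Birth'].tolist()

print(unmatched)
```

Any name listed in `unmatched` would be missing from the maps and totals unless its spelling is aligned between the two files.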

Mapping

In [5]:

fig = px.choropleth(df_merge, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='world',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>Persons Obtaining LPR Status by Country of Birth in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Green card recipient figures by region

First, we compute the total number of new LPRs per region:

In [6]:

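# Sum only the numeric columns; recent pandas versions no longer drop the string columns silently
grouped = df_merge.groupby('Region Name')[['LPRs 2018', 'Percentage']].sum()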
grouped.sort_values(by=['LPRs 2018'], ascending=False, inplace=True)
print(grouped)

               LPRs 2018  Percentage
Region Name                         
North America     418980   38.279269
Asia              397180   36.287556
Africa            115735   10.573897
Europe             79126    7.229189
South America      78876    7.206348
Oceania             4638    0.423742
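
As a quick sanity check, the Percentage column can be re-derived from the regional totals printed above. The sketch below copies those six totals by hand into a plain dictionary (the names `region_totals` and `shares` are introduced only for illustration).

```python
# Regional totals copied from the output above
region_totals = {
    'North America': 418980,
    'Asia': 397180,
    'Africa': 115735,
    'Europe': 79126,
    'South America': 78876,
    'Oceania': 4638,
}

# Total number of new LPRs that survived the merge
total = sum(region_totals.values())

# Share of each region, in percent of that total
shares = {region: count / total * 100 for region, count in region_totals.items()}

for region, share in shares.items():
    print(f"{region:<14}{share:6.2f} %")
```

The shares are relative to the merged data only, so they sum to 100 even if some countries were dropped by the merge.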

In [7]:

fig = px.bar(df_merge, 
             x='LPRs 2018', 
             y='Region Name',
             labels={'LPRs 2018':'Green Card recipients', 'Region Name':''}, 
             title="<b>Persons Obtaining LPR Status by Region of Birth in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['orange']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Then, to make it easier to visualize new LPRs region by region, we group the data by Region Name. Because df_merge is sorted in descending order, first() returns the top country of birth for each region:

In [8]:

df_area = df_merge.groupby('Region Name')
df_area.first()

Out[8]:

	Country of Birth	LPRs 2018	ISO-alpha3 Code	Percentage
Region Name
Africa	Nigeria	13952	NGA	1.274697
Asia	China, People's Republic	65214	CHN	5.958147
Europe	Ukraine	11879	UKR	1.085301
North America	Mexico	161858	MEX	14.787832
Oceania	Australia	2693	AUS	0.246041
South America	Colombia	17545	COL	1.602964
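
The behaviour of `first()` and `get_group()` on a sorted frame can be illustrated on a four-row toy frame. The values are taken from the merged table above, but the frame `toy` itself is only an illustrative assumption.

```python
import pandas as pd

# Four rows taken from the merged table, deliberately out of order
toy = pd.DataFrame({
    'Country of Birth': ['Cuba', 'India', 'Mexico', "China, People's Republic"],
    'LPRs 2018': [76486, 59821, 161858, 65214],
    'Region Name': ['North America', 'Asia', 'North America', 'Asia'],
})

# Sort descending first, so the first row of each group is its largest country
toy = toy.sort_values('LPRs 2018', ascending=False)
by_region = toy.groupby('Region Name')

# first() returns the first non-null entry of each column within each group
top_per_region = by_region.first()

# get_group() returns all rows of one region, in their sorted order
asia = by_region.get_group('Asia')

print(top_per_region)
print(asia)
```

Without the preceding sort, `first()` would simply return whichever row happens to come first in each region.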

North America

In [9]:

df_north_america = df_area.get_group('North America')
df_north_america = df_north_america.sort_values('LPRs 2018', ascending=True)

In [10]:

fig = px.choropleth(df_north_america, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='north america',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in North America in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

In [11]:

fig = px.bar(df_north_america.tail(10),  # sorted ascending, so tail(10) keeps the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in North America in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['red']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()
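
The bar charts rely on an ascending sort followed by `tail(10)`: the last ten rows are the ten largest, and a horizontal Plotly bar chart draws the first row at the bottom, so the largest bar ends up on top. The sketch below shows the same selection on a toy frame with three rows kept instead of ten, along with an equivalent `nlargest` formulation. The frame `toy` and its values are made up.

```python
import pandas as pd

# Made-up counts for five placeholder countries
toy = pd.DataFrame({
    'Country of Birth': ['A', 'B', 'C', 'D', 'E'],
    'LPRs 2018': [50, 10, 40, 20, 30],
})

# Approach used in this notebook: sort ascending, keep the last rows
top3_tail = toy.sort_values('LPRs 2018', ascending=True).tail(3)

# Equivalent alternative: pick the largest rows, then order them for plotting
top3_nlargest = toy.nlargest(3, 'LPRs 2018').sort_values('LPRs 2018', ascending=True)

print(top3_tail)
print(top3_nlargest)
```

Both frames contain the same rows in the same order, so either can be passed to `px.bar`.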

Asia

In [12]:

df_asia = df_area.get_group('Asia')
df_asia = df_asia.sort_values('LPRs 2018', ascending=True)

In [13]:

fig = px.choropleth(df_asia, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='asia',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in Asia in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

In [14]:

fig = px.bar(df_asia.tail(10),  # sorted ascending, so tail(10) keeps the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in Asia in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h', 
             color_discrete_sequence =['purple']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Africa

In [15]:

df_africa = df_area.get_group('Africa')
df_africa = df_africa.sort_values('LPRs 2018', ascending=True)

In [16]:

fig = px.choropleth(df_africa, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='africa',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in Africa in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

In [17]:

fig = px.bar(df_africa.tail(10),  # sorted ascending, so tail(10) keeps the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in Africa in 2018</b><br>" + 
             "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['yellow']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Europe

In [18]:

df_europe = df_area.get_group('Europe')
df_europe = df_europe.sort_values('LPRs 2018', ascending=True)

In [19]:

fig = px.choropleth(df_europe, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='europe',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in Europe in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

In [20]:

fig = px.bar(df_europe.tail(10),  # sorted ascending, so tail(10) keeps the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in Europe in 2018</b><br>" + 
             "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['blue']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

South America

In [21]:

df_south_america = df_area.get_group('South America')
df_south_america = df_south_america.sort_values('LPRs 2018', ascending=True)

In [22]:

fig = px.choropleth(df_south_america, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='south america',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in South America in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

In [23]:

fig = px.bar(df_south_america.tail(10),  # sorted ascending, so tail(10) keeps the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in South America in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['green']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Sources

U.S. Department of Homeland Security
