American Lawful Immigration 2018 : New LPRs by Country of Birth

Posted on Fri. 04 September 2020 in Politics • 5 min read


Lawful Permanent Residents (LPRs) are non-citizens who are legally authorized to live permanently in the United States. Let’s explore data from the U.S. Department of Homeland Security to highlight the regions and countries of birth of the people who received a green card in 2018.

Import required libraries

In [1]:
import pandas as pd
import plotly.express as px
import plotly.io as pio
from IPython.display import Javascript

Javascript(
"""require.config({
 paths: { 
     plotly: 'https://cdn.plot.ly/plotly-latest.min'
 }
});"""
)

pio.renderers.default = 'notebook_connected'

Data pre-processing

To map the number of new Lawful Permanent Residents in 2018, we need to merge two datasets:

  • The Persons Obtaining Lawful Permanent Resident Status by Region and Country of Birth dataset, our main dataset, which contains among other things the number of new LPRs in 2018 for each country of birth;
  • The ISO codes dataset, which lets us attach the associated region and ISO-alpha3 code to each country of birth.

Green Card new recipients 2018 dataset

In [2]:
df = pd.read_csv('Data/Persons_Obtaining_Lawful_Permanent_Resident_Status_by_Region_and_Country_of_Birth.csv',
                sep=';',
                skiprows=14,
                na_values=['X', 'D'])

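# Cleaning: keep only the country name and the 2018 column
df.drop(columns=df.columns[1:10], inplace=True)
df.columns = ['Country of Birth', 'LPRs 2018']
df.dropna(inplace=True)
# Remove the space used as a thousands separator before casting to int
df['LPRs 2018'] = df['LPRs 2018'].str.replace(" ", "", regex=False)
df['LPRs 2018'] = df['LPRs 2018'].astype('int')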

df.head()
Out[2]:
  Country of Birth  LPRs 2018
0      Afghanistan      12935
1          Albania       5049
2          Algeria       2123
4           Angola        176
5         Anguilla         23
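
To see what the cleaning step does, here is a minimal sketch on a tiny in-memory sample. The rows and country names are hypothetical and only mimic the layout assumed above (semicolon separator, a space as thousands separator, X and D as placeholder codes); the real file is not reproduced here.

```python
import io

import pandas as pd

# Hypothetical sample that mimics the assumed layout of the source file
raw = io.StringIO(
    "Country of Birth;LPRs 2018\n"
    "Country A;12 935\n"
    "Country B;5 049\n"
    "Country C;D\n"
    "Country D;176\n"
    "Country E;X\n"
)

# 'X' and 'D' are read as NaN, so dropna() removes those rows
sample = pd.read_csv(raw, sep=';', na_values=['X', 'D'])
sample = sample.dropna()

# Strip the thousands separator, then cast the strings to integers
sample['LPRs 2018'] = (
    sample['LPRs 2018'].str.replace(' ', '', regex=False).astype(int)
)

print(sample)
```

Rows holding a placeholder code are dropped, which is presumably why the index of the cleaned frame above jumps from 2 to 4.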

ISO codes dataset

In [3]:
df_iso = pd.read_csv('Data/ISO_Codes.csv', 
                     sep=';', 
                     usecols= ['Region Name','Country or Area','ISO-alpha3 Code']
                    )

df_iso.rename(columns = {'Country or Area':'Country of Birth'}, inplace = True)

df_iso.head()
Out[3]:
  Region Name Country of Birth ISO-alpha3 Code
0      Africa          Algeria             DZA
1      Africa            Egypt             EGY
2      Africa            Libya             LBY
3      Africa          Morocco             MAR
4      Africa            Sudan             SDN

Merging

In [4]:
df_merge = pd.merge(df, df_iso, on='Country of Birth')
df_merge = df_merge.sort_values('LPRs 2018', ascending=False)

# Add a percentage column
df_merge['Percentage'] = df_merge['LPRs 2018']/df_merge['LPRs 2018'].sum()*100

df_merge.head()
Out[4]:
             Country of Birth  LPRs 2018    Region Name ISO-alpha3 Code  Percentage
118                    Mexico     161858  North America             MEX   14.787832
47                       Cuba      76486  North America             CUB    6.987990
39   China, People's Republic      65214           Asia             CHN    5.958147
84                      India      59821           Asia             IND    5.465426
55         Dominican Republic      57413  North America             DOM    5.245424
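
Note that an inner merge silently drops every country whose name is spelled differently in the two files. One way to spot such rows is a left merge with indicator=True. The sketch below reuses three figures from the table above; the ISO-side spelling 'China' is a hypothetical mismatch introduced for illustration only, and lprs, iso and check are illustrative names.

```python
import pandas as pd

# Three rows taken from the merged table above
lprs = pd.DataFrame({
    'Country of Birth': ['Mexico', 'Cuba', "China, People's Republic"],
    'LPRs 2018': [161858, 76486, 65214],
})

# Hypothetical ISO table: 'China' deliberately does not match the DHS spelling
iso = pd.DataFrame({
    'Country of Birth': ['Mexico', 'Cuba', 'China'],
    'Region Name': ['North America', 'North America', 'Asia'],
    'ISO-alpha3 Code': ['MEX', 'CUB', 'CHN'],
})

# A left merge keeps every DHS row; '_merge' tells where each row was found
check = pd.merge(lprs, iso, on='Country of Birth', how='left', indicator=True)
unmatched = check.loc[check['_merge'] == 'left_only', 'Country of Birth'].tolist()

print(unmatched)
```

Any name listed in unmatched would be missing from the map, so it is worth harmonizing the spellings in one of the two files before merging.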

Mapping

In [5]:
fig = px.choropleth(df_merge, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='world',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>Persons Obtaining LPR Status by Country of Birth in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Green card recipients, region by region

First, let's compute and plot the total number of new LPRs per region:

In [6]:
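# Sum only the numeric columns (recent pandas versions no longer drop text columns silently)
grouped = df_merge.groupby('Region Name')[['LPRs 2018', 'Percentage']].sum()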
grouped.sort_values(by=['LPRs 2018'], ascending=False, inplace=True)
print(grouped)
               LPRs 2018  Percentage
Region Name                         
North America     418980   38.279269
Asia              397180   36.287556
Africa            115735   10.573897
Europe             79126    7.229189
South America      78876    7.206348
Oceania             4638    0.423742
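
As a sanity check, the Percentage column can be recomputed from the regional totals printed above. The sketch below simply retypes those six totals into a Series (totals and share are illustrative names) and divides each one by their sum.

```python
import pandas as pd

# Regional totals copied from the output above
totals = pd.Series(
    {
        'North America': 418980,
        'Asia': 397180,
        'Africa': 115735,
        'Europe': 79126,
        'South America': 78876,
        'Oceania': 4638,
    },
    name='LPRs 2018',
)

# Share of each region in the total of all matched countries, in percent
share = totals / totals.sum() * 100

print(totals.sum())
print(share.round(2))
```

The shares add up to 100 because the percentages are computed against the countries that survived the merge, not against the official DHS grand total.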
In [7]:
fig = px.bar(df_merge, 
             x='LPRs 2018', 
             y='Region Name',
             labels={'LPRs 2018':'Green Card recipients', 'Region Name':''}, 
             title="<b>Persons Obtaining LPR Status by Region of Birth in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['orange']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Then, to make it easier to visualize new LPRs region by region, we group the data by Region Name:

In [8]:
df_area = df_merge.groupby('Region Name')
df_area.first()
Out[8]:
                       Country of Birth  LPRs 2018 ISO-alpha3 Code  Percentage
Region Name
Africa                          Nigeria      13952             NGA    1.274697
Asia           China, People's Republic      65214             CHN    5.958147
Europe                          Ukraine      11879             UKR    1.085301
North America                    Mexico     161858             MEX   14.787832
Oceania                       Australia       2693             AUS    0.246041
South America                  Colombia      17545             COL    1.602964
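
Because df_merge was sorted by 'LPRs 2018' in descending order, first() returns the largest country of each region, while get_group() extracts one region as a regular DataFrame. A minimal sketch with four rows taken from the merged table above (toy, by_region, top_per_region and asia are illustrative names):

```python
import pandas as pd

# Four rows from the merged table, already sorted in descending order
toy = pd.DataFrame({
    'Country of Birth': ['Mexico', 'Cuba', "China, People's Republic", 'India'],
    'LPRs 2018': [161858, 76486, 65214, 59821],
    'Region Name': ['North America', 'North America', 'Asia', 'Asia'],
})

by_region = toy.groupby('Region Name')

# First row of each group, i.e. the largest country per region here
top_per_region = by_region.first()

# All rows of a single region, returned as a plain DataFrame
asia = by_region.get_group('Asia')

print(top_per_region)
print(asia)
```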

North America

In [9]:
df_north_america = df_area.get_group('North America')
df_north_america = df_north_america.sort_values('LPRs 2018', ascending=True)
In [10]:
fig = px.choropleth(df_north_america, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='north america',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in North America in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()
In [11]:
fig = px.bar(df_north_america.tail(10), # for readability, keep only the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in North America in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['red']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()
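
The bar chart keeps the ten largest countries by sorting in ascending order and then taking tail(10); the ascending order also puts the largest bar at the top of a horizontal chart. An equivalent selection, shown here as a sketch on three North American rows from the merged table (with 2 instead of 10 to keep it short), is nlargest() followed by an ascending sort. The names toy_na, via_tail and via_nlargest are illustrative.

```python
import pandas as pd

# Three North American rows taken from the merged table
toy_na = pd.DataFrame({
    'Country of Birth': ['Mexico', 'Cuba', 'Dominican Republic'],
    'LPRs 2018': [161858, 76486, 57413],
})

# Approach used in the notebook: ascending sort, then keep the last rows
via_tail = toy_na.sort_values('LPRs 2018', ascending=True).tail(2)

# Alternative: pick the largest rows first, then sort them for plotting
via_nlargest = toy_na.nlargest(2, 'LPRs 2018').sort_values('LPRs 2018')

print(via_tail)
print(via_nlargest)
```

Both selections contain the same rows in the same order, so the choice is mostly a matter of readability.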

Asia

In [12]:
df_asia = df_area.get_group('Asia')
df_asia = df_asia.sort_values('LPRs 2018', ascending=True)
In [13]:
fig = px.choropleth(df_asia, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='asia',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in Asia in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()
In [14]:
fig = px.bar(df_asia.tail(10), # for readability, keep only the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in Asia in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h', 
             color_discrete_sequence =['purple']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Africa

In [15]:
df_africa = df_area.get_group('Africa')
df_africa = df_africa.sort_values('LPRs 2018', ascending=True)
In [16]:
fig = px.choropleth(df_africa, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='africa',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in Africa in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()
In [17]:
fig = px.bar(df_africa.tail(10), # for readability, keep only the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in Africa in 2018</b><br>" + 
             "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['yellow']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Europe

In [18]:
df_europe = df_area.get_group('Europe')
df_europe = df_europe.sort_values('LPRs 2018', ascending=True)
In [19]:
fig = px.choropleth(df_europe, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='europe',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in Europe in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()
In [20]:
fig = px.bar(df_europe.tail(10), # for readability, keep only the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in Europe in 2018</b><br>" + 
             "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['blue']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

South America

In [21]:
df_south_america = df_area.get_group('South America')
df_south_america = df_south_america.sort_values('LPRs 2018', ascending=True)
In [22]:
fig = px.choropleth(df_south_america, 
                    locations='ISO-alpha3 Code',
                    color='LPRs 2018',
                    hover_name='Country of Birth',
                    scope='south america',
                    labels={'LPRs 2018':'Green Card recipients'},
                    color_continuous_scale=px.colors.sequential.deep,
                    title="<b>New LPRs by Country of Birth in South America in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>"
                   )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()
In [23]:
fig = px.bar(df_south_america.tail(10), # for readability, keep only the 10 largest values
             x='LPRs 2018', 
             y='Country of Birth',
             labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''}, 
             title="<b>New LPRs by Country of Birth in South America in 2018</b><br>" + 
                    "<i>Source : U.S. Department of Homeland Security</i>",
             orientation ='h',
             color_discrete_sequence =['green']
            )

# Style
fig.update_layout(
    font_family='Helvetica',
    font_color='grey',
    font_size=12,
    title_font_size=20
)

fig.show()

Sources