American Lawful Immigration 2018: New LPRs by Country of Birth
Posted on Fri 04 September 2020 in Politics • 5 min read
Lawful Permanent Residents (LPRs) are non-citizens legally permitted to live permanently in the United States. Let’s explore data from the U.S. Department of Homeland Security to highlight the regions and countries of birth of the people who received a green card in 2018.
Import required libraries
In [1]:
import pandas as pd
import plotly.express as px
import plotly.io as pio
from IPython.display import Javascript
Javascript(
"""require.config({
paths: {
plotly: 'https://cdn.plot.ly/plotly-latest.min'
}
});"""
)
pio.renderers.default = 'notebook_connected'
Data pre-processing
To map new American Lawful Permanent Residents in 2018, we need to merge two datasets:
- The Persons Obtaining Lawful Permanent Resident Status by Region and Country of Birth dataset, our main dataset, which contains, among other things, the number of new LPRs in 2018 for each country of birth;
- The ISO codes dataset, which lets us attach the region and ISO-alpha3 code to each country of birth.
New Green Card recipients 2018 dataset
In [2]:
df = pd.read_csv('Data/Persons_Obtaining_Lawful_Permanent_Resident_Status_by_Region_and_Country_of_Birth.csv',
sep=';',
skiprows=14,
na_values=['X', 'D'])
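# Cleaning: keep only the country column and the last column (2018 figures)
df.drop(columns=df.columns[1:10], inplace=True)
df.columns = ['Country of Birth', 'LPRs 2018']
# Cells marked 'X' or 'D' in the file were read as NaN; drop those rows
df.dropna(inplace=True)
# Figures use a space as thousands separator: strip it before casting to int
df['LPRs 2018'] = df['LPRs 2018'].str.replace(" ", "").astype('int')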
df.head()
Out[2]:
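To see what the cleaning step does without the DHS file at hand, here is a minimal sketch on a tiny in-memory sample. The three rows and their figures are made up for illustration, and the sample already has only two columns, so the column-dropping step is skipped.

```python
import io

import pandas as pd

# Hypothetical three-row sample; names and figures are made up.
sample = io.StringIO(
    "Country of Birth;LPRs 2018\n"
    "Alpha;1 234\n"
    "Beta;D\n"
    "Gamma;56\n"
)

# 'X' and 'D' markers are read as missing values, as in the cell above
toy = pd.read_csv(sample, sep=';', na_values=['X', 'D'])

# Drop rows without a figure, then strip the space used as thousands separator
toy = toy.dropna()
toy['LPRs 2018'] = (
    toy['LPRs 2018'].str.replace(' ', '', regex=False).astype('int')
)

print(toy)
```

Note that the `.str` accessor only works while the column still holds text, which is the case here because at least one figure contains a space.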
ISO codes dataset
In [3]:
df_iso = pd.read_csv('Data/ISO_Codes.csv',
sep=';',
usecols= ['Region Name','Country or Area','ISO-alpha3 Code']
)
df_iso.rename(columns = {'Country or Area':'Country of Birth'}, inplace = True)
df_iso.head()
Out[3]:
Merging
In [4]:
df_merge = pd.merge(df, df_iso, on='Country of Birth')
df_merge = df_merge.sort_values('LPRs 2018', ascending=False)
# Add a percentage column
df_merge['Percentage'] = df_merge['LPRs 2018']/df_merge['LPRs 2018'].sum()*100
df_merge.head()
Out[4]:
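Because `pd.merge` performs an inner join by default, a country whose name is spelled differently in the two files drops out silently, and the Percentage column is then computed over the remaining rows only. A minimal sketch of how one might check for such mismatches, using made-up stand-ins for `df` and `df_iso` (the country names, figures and ISO codes below are hypothetical):

```python
import pandas as pd

# Hypothetical stand-ins for df and df_iso; all values are made up.
lprs = pd.DataFrame({
    'Country of Birth': ['Alpha', 'Beta', 'Gamma Republic'],
    'LPRs 2018': [100, 50, 25],
})
iso = pd.DataFrame({
    'Country of Birth': ['Alpha', 'Beta', 'Gamma'],
    'Region Name': ['Asia', 'Europe', 'Africa'],
    'ISO-alpha3 Code': ['AAA', 'BBB', 'GGG'],
})

# A left join keeps every row of lprs; indicator=True adds a '_merge' column
# telling whether each row found a partner in iso.
checked = pd.merge(lprs, iso, on='Country of Birth', how='left', indicator=True)

unmatched = checked.loc[
    checked['_merge'] == 'left_only', 'Country of Birth'
].tolist()
print(unmatched)  # ['Gamma Republic']
```

Any name listed in `unmatched` would need to be renamed in one of the two frames before the inner merge if it is to appear on the map.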
Mapping
In [5]:
fig = px.choropleth(df_merge,
locations='ISO-alpha3 Code',
color='LPRs 2018',
hover_name='Country of Birth',
scope='world',
labels={'LPRs 2018':'Green Card recipients'},
color_continuous_scale=px.colors.sequential.deep,
title="<b>Persons Obtaining LPR Status by Country of Birth in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>"
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
Green Card recipients region by region
First, we want to show the total number of new LPRs per region:
In [6]:
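# Sum only the numeric columns; recent pandas versions no longer skip text columns silently
grouped = df_merge.groupby('Region Name')[['LPRs 2018', 'Percentage']].sum()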
grouped.sort_values(by=['LPRs 2018'], ascending=False, inplace=True)
print(grouped)
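The bar chart that follows is drawn from `df_merge`, so each region's bar is built from one segment per country. If a single bar per region is preferred, one option is to aggregate first and plot the result. A minimal sketch of that aggregation on made-up data (the `toy` frame and the name `region_totals` are illustrative, not part of the notebook):

```python
import pandas as pd

# Hypothetical stand-in for df_merge; names and figures are made up.
toy = pd.DataFrame({
    'Country of Birth': ['Alpha', 'Beta', 'Gamma', 'Delta'],
    'Region Name': ['Asia', 'Asia', 'Europe', 'Africa'],
    'LPRs 2018': [60, 20, 15, 5],
})

# One row per region, with 'Region Name' kept as a regular column
region_totals = (
    toy.groupby('Region Name', as_index=False)['LPRs 2018']
       .sum()
       .sort_values('LPRs 2018', ascending=False)
)

# Share of each region in the overall total
region_totals['Percentage'] = (
    region_totals['LPRs 2018'] / region_totals['LPRs 2018'].sum() * 100
)

print(region_totals)
```

A frame shaped like `region_totals` could then be passed to `px.bar` in place of `df_merge`, with the same `x` and `y` arguments.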
In [7]:
fig = px.bar(df_merge,
x='LPRs 2018',
y='Region Name',
labels={'LPRs 2018':'Green Card recipients', 'Region Name':''},
title="<b>Persons Obtaining LPR Status by Region of Birth in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>",
orientation ='h',
color_discrete_sequence =['orange']
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
Then, to make the region-by-region visualizations easier to build, we group the data by Region Name:
In [8]:
df_area = df_merge.groupby('Region Name')
df_area.first()
Out[8]:
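Each regional section below repeats the same two steps: select one region, then sort ascending and keep the tail so that only the ten largest countries are charted. A minimal sketch of a helper that does the same with `nlargest`, on made-up data (the function name `top_countries` and the `toy` frame are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for df_merge; names and figures are made up.
toy = pd.DataFrame({
    'Country of Birth': ['Alpha', 'Beta', 'Gamma', 'Delta', 'Epsilon'],
    'Region Name': ['Asia', 'Asia', 'Asia', 'Europe', 'Europe'],
    'LPRs 2018': [60, 20, 40, 15, 5],
})


def top_countries(frame, region, n=10):
    """Return the n largest countries of one region, sorted ascending."""
    region_rows = frame[frame['Region Name'] == region]
    # nlargest picks the top n rows; the ascending sort mirrors the
    # ordering used before the horizontal bar charts in this notebook.
    return (
        region_rows.nlargest(n, 'LPRs 2018')
                   .sort_values('LPRs 2018', ascending=True)
    )


top_asia = top_countries(toy, 'Asia', n=2)
print(top_asia['Country of Birth'].tolist())  # ['Gamma', 'Alpha']
```

The ascending order is kept because a horizontal Plotly bar chart draws rows from bottom to top, so the largest value ends up as the top bar.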
North America
In [9]:
df_north_america = df_area.get_group('North America')
df_north_america = df_north_america.sort_values('LPRs 2018', ascending=True)
In [10]:
fig = px.choropleth(df_north_america,
locations='ISO-alpha3 Code',
color='LPRs 2018',
hover_name='Country of Birth',
scope='north america',
labels={'LPRs 2018':'Green Card recipients'},
color_continuous_scale=px.colors.sequential.deep,
title="<b>New LPRs by Country of Birth in North America in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>"
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
In [11]:
fig = px.bar(df_north_america.tail(10), # sorted ascending, so tail(10) keeps the 10 largest values
x='LPRs 2018',
y='Country of Birth',
labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''},
title="<b>New LPRs by Country of Birth in North America in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>",
orientation ='h',
color_discrete_sequence =['red']
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
Asia
In [12]:
df_asia = df_area.get_group('Asia')
df_asia = df_asia.sort_values('LPRs 2018', ascending=True)
In [13]:
fig = px.choropleth(df_asia,
locations='ISO-alpha3 Code',
color='LPRs 2018',
hover_name='Country of Birth',
scope='asia',
labels={'LPRs 2018':'Green Card recipients'},
color_continuous_scale=px.colors.sequential.deep,
title="<b>New LPRs by Country of Birth in Asia in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>"
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
In [14]:
fig = px.bar(df_asia.tail(10), # sorted ascending, so tail(10) keeps the 10 largest values
x='LPRs 2018',
y='Country of Birth',
labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''},
title="<b>New LPRs by Country of Birth in Asia in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>",
orientation ='h',
color_discrete_sequence =['purple']
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
Africa
In [15]:
df_africa = df_area.get_group('Africa')
df_africa = df_africa.sort_values('LPRs 2018', ascending=True)
In [16]:
fig = px.choropleth(df_africa,
locations='ISO-alpha3 Code',
color='LPRs 2018',
hover_name='Country of Birth',
scope='africa',
labels={'LPRs 2018':'Green Card recipients'},
color_continuous_scale=px.colors.sequential.deep,
title="<b>New LPRs by Country of Birth in Africa in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>"
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
In [17]:
fig = px.bar(df_africa.tail(10), # sorted ascending, so tail(10) keeps the 10 largest values
x='LPRs 2018',
y='Country of Birth',
labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''},
title="<b>New LPRs by Country of Birth in Africa in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>",
orientation ='h',
color_discrete_sequence =['yellow']
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
Europe
In [18]:
df_europe = df_area.get_group('Europe')
df_europe = df_europe.sort_values('LPRs 2018', ascending=True)
In [19]:
fig = px.choropleth(df_europe,
locations='ISO-alpha3 Code',
color='LPRs 2018',
hover_name='Country of Birth',
scope='europe',
labels={'LPRs 2018':'Green Card recipients'},
color_continuous_scale=px.colors.sequential.deep,
title="<b>New LPRs by Country of Birth in Europe in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>"
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
In [20]:
fig = px.bar(df_europe.tail(10), # sorted ascending, so tail(10) keeps the 10 largest values
x='LPRs 2018',
y='Country of Birth',
labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''},
title="<b>New LPRs by Country of Birth in Europe in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>",
orientation ='h',
color_discrete_sequence =['blue']
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
South America
In [21]:
df_south_america = df_area.get_group('South America')
df_south_america = df_south_america.sort_values('LPRs 2018', ascending=True)
In [22]:
fig = px.choropleth(df_south_america,
locations='ISO-alpha3 Code',
color='LPRs 2018',
hover_name='Country of Birth',
scope='south america',
labels={'LPRs 2018':'Green Card recipients'},
color_continuous_scale=px.colors.sequential.deep,
title="<b>New LPRs by Country of Birth in South America in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>"
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
In [23]:
fig = px.bar(df_south_america.tail(10), # sorted ascending, so tail(10) keeps the 10 largest values
x='LPRs 2018',
y='Country of Birth',
labels={'LPRs 2018':'Green Card recipients', 'Country of Birth':''},
title="<b>New LPRs by Country of Birth in South America in 2018</b><br>" +
"<i>Source : U.S. Department of Homeland Security</i>",
orientation ='h',
color_discrete_sequence =['green']
)
# Style
fig.update_layout(
font_family='Helvetica',
font_color='grey',
font_size=12,
title_font_size=20
)
fig.show()
Sources