Commit 5d5706a

authored
Update 11-joins.md
1 parent 73a64d4 commit 5d5706a

1 file changed

+9
-13
lines changed

_episodes/11-joins.md

Lines changed: 9 additions & 13 deletions
@@ -30,11 +30,6 @@ In this episode we will consider different scenarios and show we might join the
3030
cases the first step will be to read the datasets into a pandas Dataframe from where we will do the joining. The csv
3131
files we are using are cut down versions of the SN7577 dataset to make the displays more manageable.
3232

33-
There are a few ways to merge files. In database lingo, a merge operation is called a `JOIN`. Some of these are
34-
shown in the table below.
35-
36-
![pandas_join_types](../fig/pandas_join_types.png)
37-
3833
First, let's download the datafiles. They are listed in the [setup page][setup-page] for the lesson. Alternatively,
3934
you can download the [GitHub repository for this lesson][gh-repo]. The data files are in the
4035
*data* directory. If you're using Jupyter, make sure to place these files in the same directory where your notebook
@@ -128,6 +123,15 @@ We can join columns from two Dataframes using the `merge()` function. This is si
128123

129124
A detailed discussion of different join types is given in the [SQL lesson](./episodes/sql...).
130125

126+
You specify the type of join you want using the `how` parameter. The default is the `inner` join, which returns the columns from both tables for only those rows where the `key` (common column) values match in both Dataframes.
127+
128+
The possible values of the `how` parameter are shown in the picture below (taken from the Pandas documentation)
129+
130+
![pandas_join_types](../fig/pandas_join_types.png)
131+
132+
The different join types behave in the same way as they do in SQL. In Python/pandas, any missing values are shown as `NaN`.
133+
134+
131135
In order to `merge` the Dataframes we need to identify a column common to both of them.
132136

133137
~~~
@@ -152,14 +156,6 @@ df_cd = pd.merge(df_SN7577i_c, df_SN7577i_d, how='inner', left_on = 'Id', right_
152156
~~~
153157
{: .language-python}
154158

155-
You specify the type of join you want using the `how` parameter. The default is the `inner` join which returns the columns from both tables where the `key` or common column values match in both Dataframes.
156-
157-
The possible values of the `how` parameter are shown in the picture below (taken from the Pandas documentation)
158-
159-
![pandas_join_types](../fig/pandas_join_types.png)
160-
161-
The different join types behave in the same way as they do in SQL. In Python/pandas, any missing values are shown as `NaN`
162-
163159

164160
> ## Exercises
165161
>
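
The paragraphs this commit moves describe how the `how` parameter selects the join type and how unmatched values appear as `NaN`. A minimal sketch of that behaviour follows. It uses two small hypothetical tables (`left` and `right`, with made-up columns `Q1` and `Q2`) rather than the SN7577 files, so the row counts are only illustrative.

```python
import pandas as pd

# Hypothetical toy tables (NOT the SN7577 data); only Id 2 and 3 appear in both.
left = pd.DataFrame({"Id": [1, 2, 3], "Q1": [10, 20, 30]})
right = pd.DataFrame({"Id": [2, 3, 4], "Q2": ["a", "b", "c"]})

# Run the same merge once per join type, keyed on the shared 'Id' column.
results = {
    how: pd.merge(left, right, how=how, left_on="Id", right_on="Id")
    for how in ("inner", "left", "right", "outer")
}

for how, df in results.items():
    # Rows with no match on the other side get NaN in that side's columns.
    print(f"how='{how}': {len(df)} rows")
    print(df)
```

With these toy tables, `inner` keeps only the two matching `Id` values, `left` and `right` each keep three rows with one `NaN`, and `outer` keeps all four `Id` values.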
