diff --git a/book/2e/02.Rmd b/book/2e/02.Rmd
index 43c15dc..d7145a6 100644
--- a/book/2e/02.Rmd
+++ b/book/2e/02.Rmd
@@ -365,7 +365,7 @@ curl -s "https://www.gutenberg.org/files/11/11-0.txt" | grep " CHAPTER" | wc -l
 ```
-<1> The option `-l` specifies that `wc` should only output the number of lines that are pass into it. By default it also returns the number of characters and words.
+<1> The option `-l` specifies that `wc` should only output the number of lines that are passed into it. By default it also returns the number of characters and words.
 You can think of piping as an automated copy and paste.
 Once you get the hang of combining tools using the pipe operator, you'll find that there are virtually no limits to this.
@@ -403,7 +403,7 @@ The tool `echo` outputs the value you specify.
 The `-n` option, which stands for *newline*, specifies that `echo` should not output a trailing newline.
 Saving the output to a file is useful if you need to store intermediate results, for example to continue with your analysis at a later stage.
-To use the contents of the file *greeting.txt* again, we can use `cat`, which reads a file prints it.
+To use the contents of the file *greeting.txt* again, we can use `cat`, which reads a file and prints it.
 ```{console, callouts="wc"}
 cat greeting.txt
@@ -571,7 +571,7 @@ n#! enter=FALSE, expect_prompt=TRUE
 ### Managing Output
-Sometimes a tools or sequence of tools produces too much output to include in the book.
+Sometimes a tool or sequence of tools produces too much output to include in the book.
 Instead of manually altering such output, I prefer to be transparent by piping it through a helper tool.
 You don't necessarily have to do this, especially if you're interested in the complete output.
diff --git a/book/2e/03.Rmd b/book/2e/03.Rmd
index 35ca8bd..b5416a5 100644
--- a/book/2e/03.Rmd
+++ b/book/2e/03.Rmd
@@ -241,7 +241,7 @@ This can be done with the `-t` option (instead of the `-x` option):
 tar -tzf logs.tar.gz | trim
 ```
-Is seems that this archive contains a lot of files, and they are not inside a directory.
+It seems that this archive contains a lot of files, and they are not inside a directory.
 In order to keep the current directory clean, it's a good idea to first create a new directory using `mkdir` and extract those files there using the `-C` option.
 ```{console tar_mkdir}
diff --git a/book/2e/07.Rmd b/book/2e/07.Rmd
index 5a3fb1b..da0f77d 100644
--- a/book/2e/07.Rmd
+++ b/book/2e/07.Rmd
@@ -53,7 +53,7 @@ Any other files are either downloaded or generated using command-line tools.
 In this section I’ll demonstrate how to inspect your dataset and its properties.
 Because the upcoming visualization and modeling techniques expect the data to be in a rectangular shape, I’ll assume that the data is in CSV format.
 You can use the techniques described in [Chapter 5](#chapter-5-scrubbing-data) to convert your data to CSV if necessary.
-For simplicity sake, I’ll also assume that your data has a header.
+For simplicity's sake, I’ll also assume that your data has a header.
 In the first subsection I'll show a way to determine whether that's the case.
 Once you know you have a header, you can continue answering the following questions:
diff --git a/book/2e/09.Rmd b/book/2e/09.Rmd
index 7bc8ecb..b6a9af9 100644
--- a/book/2e/09.Rmd
+++ b/book/2e/09.Rmd
@@ -477,7 +477,7 @@ Now that I have a balanced training dataset and a balanced test dataset, I can c
 ### Running the Experiment
 Training a classifier in `skll` is done by defining an experiment in a configuration file.
-It consists of several sections that specify, for example, where to look for the datasets, which classifiers
+It consists of several sections that specify, for example, where to look for the datasets, which classifiers to use, and how to tune the model.
 Here's the configuration file *classify.cfg* that I'll use:
 ```{console bat_cfg}