Commit 47f4628

Minor restructuring of Lesson 1.2.1
1 parent 5f3c13b commit 47f4628

13 files changed: 4 additions and 676 deletions

Lesson 1-2-1 - Introduction to Jupyter Notebook and programming in Python.ipynb

Lines changed: 2 additions & 338 deletions
@@ -2077,343 +2077,7 @@
20772077
"source": [
20782078
"Using `*args` and `**kwargs` in your function calls while you're developing makes it easier to change your code without having to go back through every line of code that calls your function and bug-fix when you change the order or number of arguments you're calling. \n",
20792079
"\n",
2080-
"This reduces errors, improves readability, and makes for a more enjoyable and efficient coding experience.\n",
2081-
"\n",
2082-
"At this stage, you've learned the fundamental syntax, as well as how to create modular code. There is a great deal more to learn, but we will cover only one more thing before heading back into data wrangling."
2083-
]
2084-
},
2085-
{
2086-
"cell_type": "markdown",
2087-
"metadata": {},
2088-
"source": [
2089-
"## Built-in modules\n",
2090-
"\n",
2091-
"There are a vast range of built-in modules. Jupyter Notebook comes with an even larger list of third-party modules you can explore.\n",
2092-
"\n",
2093-
"<div class=\"alert alert-block alert-info\">\n",
2094-
" <b>Syntax</b>\n",
2095-
" <br>\n",
2096-
" <ul>\n",
2097-
" <li>After you've imported a module, <code>dir(module)</code> lets you see a list of all the functions implemented in that library.</li>\n",
2098-
" <li>You can also read the help from the module docstrings with <code>help(module)</code></li>\n",
2099-
" </ul>\n",
2100-
"</div>\n",
2101-
"\n",
2102-
"Let's explore a module you'll be using and learning about in future sessions of this course, `pandas`."
2103-
]
2104-
},
2105-
{
2106-
"cell_type": "code",
2107-
"execution_count": 9,
2108-
"metadata": {},
2109-
"outputs": [
2110-
{
2111-
"name": "stdout",
2112-
"output_type": "stream",
2113-
"text": [
2114-
"Help on package pandas:\n",
2115-
"\n",
2116-
"NAME\n",
2117-
" pandas\n",
2118-
"\n",
2119-
"DESCRIPTION\n",
2120-
" pandas - a powerful data analysis and manipulation library for Python\n",
2121-
" =====================================================================\n",
2122-
" \n",
2123-
" **pandas** is a Python package providing fast, flexible, and expressive data\n",
2124-
" structures designed to make working with \"relational\" or \"labeled\" data both\n",
2125-
" easy and intuitive. It aims to be the fundamental high-level building block for\n",
2126-
" doing practical, **real world** data analysis in Python. Additionally, it has\n",
2127-
" the broader goal of becoming **the most powerful and flexible open source data\n",
2128-
" analysis / manipulation tool available in any language**. It is already well on\n",
2129-
" its way toward this goal.\n",
2130-
" \n",
2131-
" Main Features\n",
2132-
" -------------\n",
2133-
" Here are just a few of the things that pandas does well:\n",
2134-
" \n",
2135-
" - Easy handling of missing data in floating point as well as non-floating\n",
2136-
" point data\n",
2137-
" - Size mutability: columns can be inserted and deleted from DataFrame and\n",
2138-
" higher dimensional objects\n",
2139-
" - Automatic and explicit data alignment: objects can be explicitly aligned\n",
2140-
" to a set of labels, or the user can simply ignore the labels and let\n",
2141-
" `Series`, `DataFrame`, etc. automatically align the data for you in\n",
2142-
" computations\n",
2143-
" - Powerful, flexible group by functionality to perform split-apply-combine\n",
2144-
" operations on data sets, for both aggregating and transforming data\n",
2145-
" - Make it easy to convert ragged, differently-indexed data in other Python\n",
2146-
" and NumPy data structures into DataFrame objects\n",
2147-
" - Intelligent label-based slicing, fancy indexing, and subsetting of large\n",
2148-
" data sets\n",
2149-
" - Intuitive merging and joining data sets\n",
2150-
" - Flexible reshaping and pivoting of data sets\n",
2151-
" - Hierarchical labeling of axes (possible to have multiple labels per tick)\n",
2152-
" - Robust IO tools for loading data from flat files (CSV and delimited),\n",
2153-
" Excel files, databases, and saving/loading data from the ultrafast HDF5\n",
2154-
" format\n",
2155-
" - Time series-specific functionality: date range generation and frequency\n",
2156-
" conversion, moving window statistics, moving window linear regressions,\n",
2157-
" date shifting and lagging, etc.\n",
2158-
"\n",
2159-
"PACKAGE CONTENTS\n",
2160-
" _libs (package)\n",
2161-
" _version\n",
2162-
" api (package)\n",
2163-
" compat (package)\n",
2164-
" computation (package)\n",
2165-
" conftest\n",
2166-
" core (package)\n",
2167-
" errors (package)\n",
2168-
" formats (package)\n",
2169-
" io (package)\n",
2170-
" json\n",
2171-
" lib\n",
2172-
" parser\n",
2173-
" plotting (package)\n",
2174-
" stats (package)\n",
2175-
" testing\n",
2176-
" tests (package)\n",
2177-
" tools (package)\n",
2178-
" tseries (package)\n",
2179-
" tslib\n",
2180-
" types (package)\n",
2181-
" util (package)\n",
2182-
"\n",
2183-
"SUBMODULES\n",
2184-
" _hashtable\n",
2185-
" _lib\n",
2186-
" _tslib\n",
2187-
" offsets\n",
2188-
"\n",
2189-
"DATA\n",
2190-
" IndexSlice = <pandas.core.indexing._IndexSlice object>\n",
2191-
" NaT = NaT\n",
2192-
" __docformat__ = 'restructuredtext'\n",
2193-
" datetools = <module 'pandas.core.datetools' from 'C:\\\\Users\\...\\lib\\\\s...\n",
2194-
" describe_option = <pandas.core.config.CallableDynamicDoc object>\n",
2195-
" get_option = <pandas.core.config.CallableDynamicDoc object>\n",
2196-
" json = <module 'pandas.json' from 'C:\\\\Users\\\\Turukawa\\...atascience\\\\...\n",
2197-
" lib = <module 'pandas.lib' from 'C:\\\\Users\\\\Turukawa\\\\...datascience\\\\...\n",
2198-
" options = <pandas.core.config.DictWrapper object>\n",
2199-
" parser = <module 'pandas.parser' from 'C:\\\\Users\\\\Turukaw...ascience\\\\...\n",
2200-
" plot_params = {'xaxis.compat': False}\n",
2201-
" reset_option = <pandas.core.config.CallableDynamicDoc object>\n",
2202-
" set_option = <pandas.core.config.CallableDynamicDoc object>\n",
2203-
" tslib = <module 'pandas.tslib' from 'C:\\\\Users\\\\Turukawa...tascience\\\\...\n",
2204-
"\n",
2205-
"VERSION\n",
2206-
" 0.21.1\n",
2207-
"\n",
2208-
"FILE\n",
2209-
" c:\\users\\turukawa\\anaconda3\\envs\\datascience\\lib\\site-packages\\pandas\\__init__.py\n",
2210-
"\n",
2211-
"\n"
2212-
]
2213-
}
2214-
],
2215-
"source": [
2216-
"import pandas as pd\n",
2217-
"\n",
2218-
"help(pd)"
2219-
]
2220-
},
2221-
{
2222-
"cell_type": "code",
2223-
"execution_count": 10,
2224-
"metadata": {},
2225-
"outputs": [
2226-
{
2227-
"data": {
2228-
"text/plain": [
2229-
"['Categorical',\n",
2230-
" 'CategoricalIndex',\n",
2231-
" 'DataFrame',\n",
2232-
" 'DateOffset',\n",
2233-
" 'DatetimeIndex',\n",
2234-
" 'ExcelFile',\n",
2235-
" 'ExcelWriter',\n",
2236-
" 'Expr',\n",
2237-
" 'Float64Index',\n",
2238-
" 'Grouper',\n",
2239-
" 'HDFStore',\n",
2240-
" 'Index',\n",
2241-
" 'IndexSlice',\n",
2242-
" 'Int64Index',\n",
2243-
" 'Interval',\n",
2244-
" 'IntervalIndex',\n",
2245-
" 'MultiIndex',\n",
2246-
" 'NaT',\n",
2247-
" 'Panel',\n",
2248-
" 'Panel4D',\n",
2249-
" 'Period',\n",
2250-
" 'PeriodIndex',\n",
2251-
" 'RangeIndex',\n",
2252-
" 'Series',\n",
2253-
" 'SparseArray',\n",
2254-
" 'SparseDataFrame',\n",
2255-
" 'SparseList',\n",
2256-
" 'SparseSeries',\n",
2257-
" 'Term',\n",
2258-
" 'TimeGrouper',\n",
2259-
" 'Timedelta',\n",
2260-
" 'TimedeltaIndex',\n",
2261-
" 'Timestamp',\n",
2262-
" 'UInt64Index',\n",
2263-
" 'WidePanel',\n",
2264-
" '_DeprecatedModule',\n",
2265-
" '__builtins__',\n",
2266-
" '__cached__',\n",
2267-
" '__doc__',\n",
2268-
" '__docformat__',\n",
2269-
" '__file__',\n",
2270-
" '__loader__',\n",
2271-
" '__name__',\n",
2272-
" '__package__',\n",
2273-
" '__path__',\n",
2274-
" '__spec__',\n",
2275-
" '__version__',\n",
2276-
" '_hashtable',\n",
2277-
" '_lib',\n",
2278-
" '_libs',\n",
2279-
" '_np_version_under1p10',\n",
2280-
" '_np_version_under1p11',\n",
2281-
" '_np_version_under1p12',\n",
2282-
" '_np_version_under1p13',\n",
2283-
" '_np_version_under1p14',\n",
2284-
" '_np_version_under1p15',\n",
2285-
" '_tslib',\n",
2286-
" '_version',\n",
2287-
" 'api',\n",
2288-
" 'bdate_range',\n",
2289-
" 'compat',\n",
2290-
" 'concat',\n",
2291-
" 'core',\n",
2292-
" 'crosstab',\n",
2293-
" 'cut',\n",
2294-
" 'date_range',\n",
2295-
" 'datetime',\n",
2296-
" 'datetools',\n",
2297-
" 'describe_option',\n",
2298-
" 'errors',\n",
2299-
" 'eval',\n",
2300-
" 'ewma',\n",
2301-
" 'ewmcorr',\n",
2302-
" 'ewmcov',\n",
2303-
" 'ewmstd',\n",
2304-
" 'ewmvar',\n",
2305-
" 'ewmvol',\n",
2306-
" 'expanding_apply',\n",
2307-
" 'expanding_corr',\n",
2308-
" 'expanding_count',\n",
2309-
" 'expanding_cov',\n",
2310-
" 'expanding_kurt',\n",
2311-
" 'expanding_max',\n",
2312-
" 'expanding_mean',\n",
2313-
" 'expanding_median',\n",
2314-
" 'expanding_min',\n",
2315-
" 'expanding_quantile',\n",
2316-
" 'expanding_skew',\n",
2317-
" 'expanding_std',\n",
2318-
" 'expanding_sum',\n",
2319-
" 'expanding_var',\n",
2320-
" 'factorize',\n",
2321-
" 'get_dummies',\n",
2322-
" 'get_option',\n",
2323-
" 'get_store',\n",
2324-
" 'groupby',\n",
2325-
" 'infer_freq',\n",
2326-
" 'interval_range',\n",
2327-
" 'io',\n",
2328-
" 'isna',\n",
2329-
" 'isnull',\n",
2330-
" 'json',\n",
2331-
" 'lib',\n",
2332-
" 'lreshape',\n",
2333-
" 'match',\n",
2334-
" 'melt',\n",
2335-
" 'merge',\n",
2336-
" 'merge_asof',\n",
2337-
" 'merge_ordered',\n",
2338-
" 'notna',\n",
2339-
" 'notnull',\n",
2340-
" 'np',\n",
2341-
" 'offsets',\n",
2342-
" 'option_context',\n",
2343-
" 'options',\n",
2344-
" 'ordered_merge',\n",
2345-
" 'pandas',\n",
2346-
" 'parser',\n",
2347-
" 'period_range',\n",
2348-
" 'pivot',\n",
2349-
" 'pivot_table',\n",
2350-
" 'plot_params',\n",
2351-
" 'plotting',\n",
2352-
" 'pnow',\n",
2353-
" 'qcut',\n",
2354-
" 'read_clipboard',\n",
2355-
" 'read_csv',\n",
2356-
" 'read_excel',\n",
2357-
" 'read_feather',\n",
2358-
" 'read_fwf',\n",
2359-
" 'read_gbq',\n",
2360-
" 'read_hdf',\n",
2361-
" 'read_html',\n",
2362-
" 'read_json',\n",
2363-
" 'read_msgpack',\n",
2364-
" 'read_parquet',\n",
2365-
" 'read_pickle',\n",
2366-
" 'read_sas',\n",
2367-
" 'read_sql',\n",
2368-
" 'read_sql_query',\n",
2369-
" 'read_sql_table',\n",
2370-
" 'read_stata',\n",
2371-
" 'read_table',\n",
2372-
" 'reset_option',\n",
2373-
" 'rolling_apply',\n",
2374-
" 'rolling_corr',\n",
2375-
" 'rolling_count',\n",
2376-
" 'rolling_cov',\n",
2377-
" 'rolling_kurt',\n",
2378-
" 'rolling_max',\n",
2379-
" 'rolling_mean',\n",
2380-
" 'rolling_median',\n",
2381-
" 'rolling_min',\n",
2382-
" 'rolling_quantile',\n",
2383-
" 'rolling_skew',\n",
2384-
" 'rolling_std',\n",
2385-
" 'rolling_sum',\n",
2386-
" 'rolling_var',\n",
2387-
" 'rolling_window',\n",
2388-
" 'scatter_matrix',\n",
2389-
" 'set_eng_float_format',\n",
2390-
" 'set_option',\n",
2391-
" 'show_versions',\n",
2392-
" 'stats',\n",
2393-
" 'test',\n",
2394-
" 'testing',\n",
2395-
" 'timedelta_range',\n",
2396-
" 'to_datetime',\n",
2397-
" 'to_msgpack',\n",
2398-
" 'to_numeric',\n",
2399-
" 'to_pickle',\n",
2400-
" 'to_timedelta',\n",
2401-
" 'tools',\n",
2402-
" 'tseries',\n",
2403-
" 'tslib',\n",
2404-
" 'unique',\n",
2405-
" 'util',\n",
2406-
" 'value_counts',\n",
2407-
" 'wide_to_long']"
2408-
]
2409-
},
2410-
"execution_count": 10,
2411-
"metadata": {},
2412-
"output_type": "execute_result"
2413-
}
2414-
],
2415-
"source": [
2416-
"dir(pd)"
2080+
"This reduces errors, improves readability, and makes for a more enjoyable and efficient coding experience."
24172081
]
24182082
},
24192083
{
@@ -2442,7 +2106,7 @@
24422106
"name": "python",
24432107
"nbconvert_exporter": "python",
24442108
"pygments_lexer": "ipython3",
2445-
"version": "3.6.3"
2109+
"version": "3.7.7"
24462110
}
24472111
},
24482112
"nbformat": 4,
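
The markdown cell kept at the top of this diff says that using `*args` and `**kwargs` in function calls makes it easier to change code without revisiting every call site. A minimal sketch of that idea follows. The function `describe_order` and the names `order_args` and `order_kwargs` are hypothetical and do not appear in the notebook; they only illustrate unpacking at the call site.

```python
def describe_order(item, quantity=1, *args, **kwargs):
    """Hypothetical function: build a one-line summary of an order."""
    # Any extra positional arguments are collected in the tuple `args`.
    extras = ", ".join(str(a) for a in args)
    # Any extra keyword arguments are collected in the dict `kwargs`.
    options = ", ".join(f"{k}={v}" for k, v in sorted(kwargs.items()))
    return f"{quantity} x {item} | extras: [{extras}] | options: [{options}]"


# Keep the call's arguments in containers, so adding or reordering
# arguments means editing these two lines rather than every call.
order_args = ("notebook", 3)
order_kwargs = {"gift_wrap": True, "colour": "blue"}

# `*` unpacks the tuple into positional arguments,
# `**` unpacks the dict into keyword arguments.
summary = describe_order(*order_args, **order_kwargs)
print(summary)
```

Because the function accepts `**kwargs`, adding a new key to `order_kwargs` does not require a change to the function signature, which is the flexibility the lesson text describes.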
