matplotlib
diff --git a/‎docs/index.rst
Lines changed: 1 addition & 0 deletions b/‎docs/index.rst
Lines changed: 1 addition & 0 deletions
diff --git a/‎docs/tutorial/Makefile
Lines changed: 2 additions & 1 deletion b/‎docs/tutorial/Makefile
Lines changed: 2 additions & 1 deletion
diff --git a/‎docs/tutorial/closer_look_at_plot_pos.ipynb
Lines changed: 286 additions & 0 deletions b/‎docs/tutorial/closer_look_at_plot_pos.ipynb
Lines changed: 286 additions & 0 deletions
@@ -70,6 +70,7 @@ Tutorials
 
    tutorial/getting_started.rst
    tutorial/closer_look_at_viz.rst
+   tutorial/closer_look_at_plot_pos.rst
 
 Testing
 =======
 
@@ -1,4 +1,5 @@
 notebooks:
 
 	tools/nb_to_doc.py getting_started
-	tools/nb_to_doc.py closer_look_at_viz
+	tools/nb_to_doc.py closer_look_at_viz
+	tools/nb_to_doc.py closer_look_at_plot_pos
@@ -0,0 +1,286 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Using different formulations of plotting positions\n",
+    "### Looking at normal vs Weibull scales + Cunnane vs Weibull plotting positions\n",
+    "\n",
+    "When drawing a percentile, quantile, or probability plot, the potting positions of ordered data must be computed.\n",
+    "\n",
+    "For a sample $X$ with population size $n$, the plotting position of of the $j^\\mathrm{th}$ element is defined as:\n",
+    "\n",
+    "$$ \\frac{x_{j} - \\alpha}{n + 1 - \\alpha - \\beta } $$\n",
+    "\n",
+    "In this equation, α and β can take on several values. Common values are described below:"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    "    \"type 4\" (α=0, β=1)\n",
+    "        Linear interpolation of the empirical CDF.\n",
+    "    \"type 5\" or \"hazen\" (α=0.5, β=0.5)\n",
+    "        Piecewise linear interpolation.\n",
+    "    \"type 6\" or \"weibull\" (α=0, β=0)\n",
+    "        Weibull plotting positions. Unbiased exceedance probability for all distributions.\n",
+    "        Recommended for hydrologic applications.\n",
+    "    \"type 7\" (α=1, β=1)\n",
+    "        The default values in R.\n",
+    "        Not recommended with probability scales as the min and max data points get plotting positions of 0 and 1, respectively, and therefore cannot be shown.\n",
+    "    \"type 8\" (α=1/3, β=1/3)\n",
+    "        Approximately median-unbiased.\n",
+    "    \"type 9\" or \"blom\" (α=0.375, β=0.375)\n",
+    "        Approximately unbiased positions if the data are normally distributed.\n",
+    "    \"median\" (α=0.3175, β=0.3175)\n",
+    "        Median exceedance probabilities for all distributions (used in ``scipy.stats.probplot``).\n",
+    "    \"apl\" or \"pwm\" (α=0.35, β=0.35)\n",
+    "        Used with probability-weighted moments.\n",
+    "    \"cunnane\" (α=0.4, β=0.4)\n",
+    "        Nearly unbiased quantiles for normally distributed data.\n",
+    "        This is the default value.\n",
+    "    \"gringorten\" (α=0.44, β=0.44)\n",
+    "        Used for Gumble distributions."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The purpose of this tutorial is to show how the selected α and β can alter the shape of a probability plot.\n",
+    "\n",
+    "First let's get some analytical setup out of the way..."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "%matplotlib inline"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "import numpy\n",
+    "from matplotlib import pyplot\n",
+    "from scipy import stats\n",
+    "import seaborn\n",
+    "\n",
+    "clear_bkgd = {'axes.facecolor':'none', 'figure.facecolor':'none'}\n",
+    "seaborn.set(style='ticks', context='talk', color_codes=True, rc=clear_bkgd)\n",
+    "\n",
+    "import probscale\n",
+    "\n",
+    "\n",
+    "def format_axes(ax1, ax2):\n",
+    "    \"\"\" Sets axes labels and grids \"\"\"\n",
+    "    for ax in (ax1, ax2):\n",
+    "        if ax is not None:\n",
+    "            ax.set_ylim(bottom=1, top=99)\n",
+    "            ax.set_xlabel('Values of Data')\n",
+    "            seaborn.despine(ax=ax)\n",
+    "            ax.yaxis.grid(True)\n",
+    "        \n",
+    "    ax1.legend(loc='upper left', numpoints=1, frameon=False)\n",
+    "    ax1.set_ylabel('Normal Probability Scale')\n",
+    "    if ax2 is not None:\n",
+    "        ax2.set_ylabel('Weibull Probability Scale')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here we'll generate some fake, normally distributed data and define a Weibull distribution from scipy to use for a probability scale."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "numpy.random.seed(0)  # reproducible\n",
+    "data = numpy.random.normal(loc=5, scale=1.25, size=37)\n",
+    "\n",
+    "# simple weibull distribution\n",
+    "weibull = stats.weibull_min(2)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now let's create probability plots on both Weibull and normal probability scales.\n",
+    "Additionally, we'll compute the plotting positions two different but commone ways for each plot.\n",
+    "\n",
+    "First, in blue circles, we'll show the data with Weibull (α=0, β=0) plotting positions.\n",
+    "Weibull plotting positions are commonly use in my field, water resources engineering.\n",
+    "\n",
+    "In green squares, we'll use Cunnane (α=0.4, β=0.4) plotting positions.\n",
+    "Cunnane plotting positions are good for normally distributed data and are the default values."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "w_opts = {'label': 'Weibull (α=0, β=0)',     'marker': 'o', 'markeredgecolor': 'b'}\n",
+    "c_opts = {'label': 'Cunnane (α=0.4, β=0.4)', 'marker': 's', 'markeredgecolor': 'g'}\n",
+    "\n",
+    "common_opts = {\n",
+    "    'markerfacecolor': 'none',\n",
+    "    'markeredgewidth': 1.25,\n",
+    "    'linestyle': 'none'\n",
+    "}\n",
+    "\n",
+    "fig, (ax1, ax2) = pyplot.subplots(figsize=(10, 8), ncols=2, sharex=True, sharey=False)\n",
+    "\n",
+    "for dist, ax in zip([None, weibull], [ax1, ax2]):\n",
+    "    for opts, postype in zip([w_opts, c_opts,], ['weibull', 'cunnane']):\n",
+    "        probscale.probplot(data, ax=ax, dist=dist, probax='y', \n",
+    "                           scatter_kws={**opts, **common_opts}, \n",
+    "                           pp_kws={'postype': postype})\n",
+    "\n",
+    "format_axes(ax1, ax2)\n",
+    "fig.tight_layout()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This demostrates that the different formulations of the plotting positions vary  most at the extreme values of the dataset. \n",
+    "\n",
+    "Next, let's compare the HAzen/Type 5 (α=0.5, β=0.5) formulation to Cunnane.\n",
+    "Hzewn plotting positions (shown as red triangles) represet a piece-wise linear interpolation of the emperical cumulative distribution function of the dataset.\n",
+    "\n",
+    "Given the values of α and β=0.5 vary only slightly from the Cunnane values, the plotting position predictably are similar."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false,
+    "scrolled": false
+   },
+   "outputs": [],
+   "source": [
+    "h_opts = {'label': 'Hazen (α=0.5, β=0.5)', 'marker': '^', 'markeredgecolor': 'r'}\n",
+    "fig, (ax1, ax2) = pyplot.subplots(figsize=(10, 8), ncols=2, sharex=True, sharey=False)\n",
+    "\n",
+    "for dist, ax in zip([None, weibull], [ax1, ax2]):\n",
+    "    for opts, postype in zip([c_opts, h_opts,], ['cunnane', 'Hazen']):\n",
+    "        probscale.probplot(data, ax=ax, dist=dist, probax='y', \n",
+    "                           scatter_kws={**opts, **common_opts}, \n",
+    "                           pp_kws={'postype': postype})\n",
+    "\n",
+    "format_axes(ax1, ax2)\n",
+    "fig.tight_layout()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "At the risk of showing a very cluttered and hard to read figure, let's throw all three on the same normal probability scale:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fig, ax1 = pyplot.subplots(figsize=(6, 8))\n",
+    "\n",
+    "for opts, postype in zip([w_opts, c_opts, h_opts,], ['weibull', 'cunnane', 'hazen']):\n",
+    "    probscale.probplot(data, ax=ax1, dist=None, probax='y', \n",
+    "                       scatter_kws={**opts, **common_opts}, \n",
+    "                       pp_kws={'postype': postype})\n",
+    "        \n",
+    "format_axes(ax1, None)\n",
+    "fig.tight_layout()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Again, the different values of α and β don't significantly alter the shape of the probability plot near between -- say -- the lower and upper quartiles.\n",
+    "Beyond the quartiles, however, the difference is more obvious.\n",
+    "\n",
+    "The cell below computes the plotting positions with the three sets of α and β values that we've investigated and prints the first ten value for easy comparison."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "# weibull plotting positions and sorted data\n",
+    "w_probs, _ = probscale.plot_pos(data, postype='weibull')\n",
+    "\n",
+    "# normal plotting positions, returned \"data\" is identical to above\n",
+    "c_probs, _ = probscale.plot_pos(data, postype='cunnane')\n",
+    "\n",
+    "# type 4 plot positions\n",
+    "h_probs, _ = probscale.plot_pos(data, postype='hazen')\n",
+    "\n",
+    "# convert to percentages\n",
+    "w_probs *= 100\n",
+    "c_probs *= 100\n",
+    "h_probs *= 100\n",
+    "\n",
+    "print('Weibull: ', numpy.round(w_probs[:10], 2))\n",
+    "print('Cunnane: ', numpy.round(c_probs[:10], 2))\n",
+    "print('Hazen:   ', numpy.round(h_probs[:10], 2))"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}