Commit 3851f9b

Time Series QA: Make notebooks self-contained, also adding DDL and DML
Otherwise, people or QA jobs invoking individual notebooks, or in a different order, are having a hard time.
1 parent c7c0fbe commit 3851f9b

5 files changed

+58
-7
lines changed

topic/timeseries/exploratory_data_analysis.ipynb

Lines changed: 26 additions & 1 deletion
@@ -102,12 +102,37 @@
 "engine = sa.create_engine(CONNECTION_STRING, echo=os.environ.get('DEBUG'))"
 ]
 },
+{
+"cell_type": "markdown",
+"source": [
+"First, import data into CrateDB. This is a shorthand notation for the same code\n",
+"illustrated in `timeseries-queries-and-visualization.ipynb`, running corresponding\n",
+"SQL DDL and DML statements, to load the data."
+],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"source": [
+"from cratedb_toolkit.datasets import load_dataset\n",
+"\n",
+"dataset = load_dataset(\"tutorial/weather-basic\")\n",
+"dataset.dbtable(dburi=CONNECTION_STRING, table=\"weather_data\").load()"
+],
+"metadata": {
+"collapsed": false
+}
+},
 {
 "cell_type": "markdown",
 "id": "cdae15fa",
 "metadata": {},
 "source": [
-"The next step fetches data from CrateDB and load it into a pandas data frame:"
+"Then, load data from CrateDB into a pandas data frame:"
 ]
 },
 {

topic/timeseries/requirements-dev.txt

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 # Real.
-# pueblo[notebook,testing]>=0.0.7
+pueblo[notebook,testing]>=0.0.9
 
 # Development.
-pueblo[notebook,testing] @ git+https://github.com/pyveci/pueblo.git@amo/testbook
+# pueblo[notebook,testing] @ git+https://github.com/pyveci/pueblo.git@amo/testbook

topic/timeseries/requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 crate[sqlalchemy]==0.34.0
+cratedb-toolkit[datasets]==0.0.7
 refinitiv-data<1.7
 pandas<2
 pycaret>=3.0,<3.4

topic/timeseries/time-series-decomposition.ipynb

Lines changed: 26 additions & 1 deletion
@@ -106,12 +106,37 @@
 "engine = sa.create_engine(CONNECTION_STRING, echo=os.environ.get('DEBUG'))"
 ]
 },
+{
+"cell_type": "markdown",
+"source": [
+"First, import data into CrateDB. This is a shorthand notation for the same code\n",
+"illustrated in `timeseries-queries-and-visualization.ipynb`, running corresponding\n",
+"SQL DDL and DML statements, to load the data."
+],
+"metadata": {
+"collapsed": false
+}
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"outputs": [],
+"source": [
+"from cratedb_toolkit.datasets import load_dataset\n",
+"\n",
+"dataset = load_dataset(\"tutorial/weather-basic\")\n",
+"dataset.dbtable(dburi=CONNECTION_STRING, table=\"weather_data\").load()"
+],
+"metadata": {
+"collapsed": false
+}
+},
 {
 "cell_type": "markdown",
 "id": "cdae15fa",
 "metadata": {},
 "source": [
-"The next step fetches data from CrateDB and load it into a pandas data frame:"
+"Then, load data from CrateDB into a pandas data frame:"
 ]
 },
 {

topic/timeseries/timeseries-queries-and-visualization.ipynb

Lines changed: 3 additions & 3 deletions
@@ -200,9 +200,9 @@
 "id": "226e67f8",
 "metadata": {},
 "source": [
-"After inserting data, it is recommended to `ANALYZE` the tables to make the query optimizer obtain\n",
-"important statistics information about them. Let's also invoke a `REFRESH` statement beforehand,\n",
-"to make sure that the data is up-to-date."
+"After inserting data, let's invoke a `REFRESH` statement, to make sure it is\n",
+"up-to-date. It is also recommended to `ANALYZE` the tables, to make the query\n",
+"optimizer obtain important statistics information about them."
 ]
 },
 {
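
The reworded notebook cell puts `REFRESH` before `ANALYZE`. A minimal sketch of that ordering follows; it is not code from this commit. The helper names `post_insert_statements` and `run_statements` are hypothetical, and the `execute` callable stands in for whatever database connection the notebook uses.

```python
from typing import Callable, List


def post_insert_statements(table: str) -> List[str]:
    """Return the SQL statements to run after inserting data into `table`."""
    # Quote the identifier, doubling any embedded double quotes.
    quoted = '"' + table.replace('"', '""') + '"'
    return [
        # First, make the freshly written rows visible to subsequent queries.
        f"REFRESH TABLE {quoted}",
        # Then, collect table statistics for the query optimizer.
        "ANALYZE",
    ]


def run_statements(execute: Callable[[str], object], statements: List[str]) -> None:
    """Run each statement, in order, through the given `execute` callable."""
    for statement in statements:
        execute(statement)


if __name__ == "__main__":
    # Stand-in for a real connection: record the statements instead of running them.
    executed: List[str] = []
    run_statements(executed.append, post_insert_statements("weather_data"))
    print(executed)
```

With SQLAlchemy, `execute` could presumably be something like `lambda s: connection.execute(sa.text(s))`. That wiring is an assumption and is not shown in the diff.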
