# example.dsd
#
# This file contains a "DSD Message." DSD stands for "Dynamic Structured Data."
# It is a specification for messages that look a little like JSON, but have
# comments and a few features that make them slightly easier to parse on small
# (8-bit) CPUs. But don't worry about that right now. This document is meant
# to give you the "flavour" of DSD without weighing you down with specifics.
# The first thing you'll notice about DSD is that (like JSON) virtually every
# DSD message you're likely to see is a dictionary (pretty much the same thing
# as what JSON calls an "object"). This is because DSD messages, like JSON
# texts, are frequently used to communicate "structured data." Dictionaries
# in DSD start with the open-curly-brace character ({) and end with the
# close-curly-brace character (}).
{
# Now let's add a key-value entry to the dictionary. Here we're adding the
# "revision" entry. Like JSON, our entry keys are strings of characters
# surrounded by double quotes. The value of this dictionary entry is the
# integer 55.
"revision" = 55
# Or for people who prefer hexadecimal integers:
"magic number" = $B1
# Now let's add a floating point number. In JSON, floats and ints are both
# of type "number." Not so with DSD.
"percent complete" = 87.95
# Note that keys can be any valid string. They're enclosed in double quotes,
# so there's no risk of mistaking the space for anything other than just
# another character in a dictionary entry name.
# You might also notice there's no comma between the "revision" and "magic
# number" entries; DSD syntax is defined so there's no ambiguity about
# where one token ends and another begins so commas are superfluous.
# Here are a few more numbers, including a couple in scientific notation
"a negative floating point number" = -2.17
"for science people" = 6.02E23
"physicists probably recognize this one" = -1.602E-19
# Booleans are another type in DSD. The literal *TRUE means true and the
# literal *FALSE means (you guessed it) false. Both literals start with an
# asterisk and are case-insensitive. Things that don't mean false: the
# string "false", 0, no, NEIN, xFF.
"in process" = *TRUE
"complete" = *FALSE
"booleans in DSD are case insensitive" = *TrUe
# DSD defines the "Undefined Type" and uses the literal *NIL to denote it.
# In this example, we're saying that the value for "next step" is NIL:
"next step" = *NIL
# Okay. Let's look at some strings. There are two types of strings: text and
# binary. Text strings are just characters inside double quotes.
"message" = "This is a \"super\" message."
"other message" = "# comments don't work in strings."
"still another message" = "Hey, did you know that DSD strings can have
newlines embedded in them? It's TRUE!"
# Three things you may have guessed: If you want to create a string with a
# double-quote inside it, you use a backslash (\") quote digraph so the
# parser doesn't think it's at the end of the string. The comment
# character (#) doesn't do anything inside a string. The second string
# defined above contains all the characters you see there; it's not a
# comment. Lastly, characters that are in strings are IN strings; it doesn't
# matter if that character is a tab, a newline or a carriage return. This
# lets us do things like this:
"html file" = "<!DOCTYPE html>
<html>
<body>
<!-- Escape double-quotes in DSD with a backslash -->
<!-- Hash marks inside DSD strings aren't comments -->
<p class=\"comment\"><a href=\"#comment1\">Comment 1</a>: I'm a comment</p>
</body>
</html>"
# Binary strings are sequences of octets represented in BASE64 or BASE16.
# BASE64 binary strings are delimited with single quotes, like so:
"some binary data" = 'dlYI5FmUUZ+GIB219OUtzw=='
"same binary data" = (76 56 08 E4 59 94 51 9F 86 20 1D B5 F4 E5 2D CF)
"binary data, part two" = 'd lYI5F mUUZ+ GIB219O U tzw = ='
# That last example is intended to show you that white-space inside a binary
# BASE64 string is ignored. Actually everything that's not a valid BASE64
# character is ignored, so this string encodes the same sequence of octets
# as the three strings above. Please don't ever do this, however.
"binary data, part deux" = 'd?lYI5F????mUUZ+?GIB219O#U"tzw}=:='
# Note that we embedded a hash character in this binary string. As in text
# strings, it doesn't start a comment. Comments DO work inside base16
# binary strings, so we can do things like this:
"remember those old hexedit programs?" = (
c1 b9 ba 79 b7 3e a3 8f 64 a6 da 2f 54 42 e1 d3 # ...y.>..d../TB..
6d 59 2d cc 52 84 c9 ed 25 65 47 6b f4 bf bd 45 # mY-.R...%eGk...E
67 a1 c6 37 67 38 4d f1 0e ef 90 94 ae 56 78 88 # g..7g8M......Vx.
2e cd 7f fe d2 6d e7 4b 20 b1 7c 3f ad af fd 36 # .....m.K .|?...6
x43 ed 27 76 1b 27 07 43 4f a5 dc 4d ed 99 22 04 # C.'v.'.CO..M..".
)
# Also, the hex digits inside a base16 binary string can use upper or lower
# case letters and *MAY* be prefixed with x's. It's okay to prefix a base16
# digit with x's because everything that's not a valid digit or comment
# character is ignored.
"surprisingly valid base16 binary string" = (&&9Dxt22) # 9D 22
# DSD also does arrays, and they behave sort of like you expect
"i am an array" = [ *nil *true *false 0 3.14 "string" 'rmqN' ( 9d 22 ) ]
# And you can include dictionaries inside dictionaries
"yet another dictionary" = {
"one" = 1
"two" = 2
}
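# A small sketch combining the two ideas above, assuming arrays may hold
# dictionaries as elements (the key names here are made up for illustration):
"array of dictionaries" = [ { "id" = 1 } { "id" = 2 } ]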
}
# Okay. This is where things get weird. DSD messages are assumed to be arrays.
# So when you parse a bit of text, it's as if you put square brackets ([ and ])
# around it. So it's perfectly valid to start another dictionary after you
# finish the previous one:
{
"success" = *false
"error" = "The file was not found"
"errno" = 2
}
# You can even just throw down some basic types. It's okay.
"This is a \"string\"."
42
( DE AD BE EF )
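# To illustrate the implicit array (shown inside a comment so it doesn't add
# another value to this message): a message whose entire text is
#   42 ( DE AD BE EF )
# would be read as if it had been written
#   [ 42 ( DE AD BE EF ) ]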
# And now things get really weird.
#
# DSD is a transfer syntax that does not have a concretized type system.
# This is unlike JSON, which inherits JavaScript's types. JSON numbers are
# commonly decoded as IEEE 754 doubles, so integers beyond 53 bits can't be
# relied on to survive intact. In DSD we don't care how many bits your
# integers are because it's a transfer syntax, not a serialization format.
# (Technically DSD is an abstract transfer syntax and DSD/TEXT is the
# concretized transfer syntax described here.) DSD/TEXT describes the rules
# for identifying the beginning and end of a lexical element and rules for
# what it is supposed to mean.
#
# If you're on a system that only supports 16 bit integers in hardware, you
# *may* only want to receive numbers between -32768 and 65535 (inclusive).
# If you're on a 32 bit system, you may only want to receive integers
# representable with 32 bits.
#
# As a message receiver, you can't force a sender to only send you values you
# can consume. But as a message sender, you can tell the receiver what
# concretized type system you intend to use. In DSD, you do this with type
# system annotations, and they look like this:
@t
# This is the type system annotation for the "tiny" type system. If you're a
# message sender and you put this in the stream, it means you promise to only
# send messages that contain integers representable in 8 bits, 16 bit
# IEEE 754-2008 binary floating point numbers and strings that are no more
# than 255 octets long.
#
# Tiny is the DEFAULT concretized type system. If you encode a value that
# exceeds the TINY limitations without signaling a different type system
# (by using the @s, @m, @l or @u tokens) you'll create an invalid DSD message.
#
# The tiny concretized type system is intended to be used on extremely
# constrained 8-bit micro-controllers, much like the ones embedded in IoT
# "smart tags" that are powered intermittently via RF transmission.
#
# Here's an example:
{
"integer minimum" = -128
"integer maximum" = 127
"float minimum" = -5.96046E-8
"float maximum" = 6.5504E4
}
@s
# This is the type system annotation for the "small" type system. This means
# you'll only send messages whose values are representable by 16 bit integers,
# 32 bit IEEE 754-2008 decimal floating point numbers and strings that are no
# more than 65535 octets long.
#
# The small concretized type system is intended for use by "beefy" 8-bit
# micro-controllers often found in temperature sensors or 1980s home computers.
# (Yes, parsing DSD messages on Commodore 64s and Apple ][s is a documented
# use case.)
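#
# Here's a sketch of the integer limits, assuming signed 16 bit integers by
# analogy with the tiny example above (the key names are illustrative):
{
"integer minimum" = -32768
"integer maximum" = 32767
}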
@m
# This is the type system annotation for the "medium" type system. It means
# you promise to only send messages whose integers are representable in 32
# bits, whose floats are limited to 64 bit IEEE 754-2008 decimals and whose
# strings are shorter than 2^32 octets.
#
# The medium concretized type system is intended for net-books & mobile phones
# powered by 32-bit ARM CPUs.
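#
# Here's a sketch of the integer limits, assuming signed 32 bit integers by
# analogy with the tiny example above (the key names are illustrative):
{
"integer minimum" = -2147483648
"integer maximum" = 2147483647
}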
@l
# This is the type system annotation for the "large" type system. Its limits
# are 64 bit ints, 128 bit IEEE 754-2008 decimals and 2^44 octet strings.
#
# The large concretized type system is intended for modern 64-bit servers.
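#
# Here's a sketch of the integer limits, assuming signed 64 bit integers by
# analogy with the tiny example above (the key names are illustrative):
{
"integer minimum" = -9223372036854775808
"integer maximum" = 9223372036854775807
}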
@u
# This is the "unlimited" type system. When a sender sends this, it means they
# can't guarantee values they send will have any limitations. Hopefully the
# receiver will have an arbitrary multi-precision library handy and lots of
# memory.
# Remember, type annotations do not force message receivers to accept messages
# that use a particular level of concretized type system. They are used by
# senders to tell message receivers that the sender promises to constrain the
# values it sends to those representable in the annotated type system.
#
# DSD does not specify what should happen when a receiver gets a message
# annotated with a type system it can't handle. Maybe the receiver should send
# an error response? Maybe it should just try to decode messages as best it
# can? Maybe it could store the message and forward it to a system that can
# decode it? There's no one right thing to do, which is why DSD itself doesn't
# define processing expectations for this situation. If someone tells you they
# might send you values you can't parse, you're still pretty much on your own
# as to what you decide to do about it.
#
# Type system annotations can appear anywhere in a message, but users are
# encouraged to add them before they send integers, floating point numbers or
# strings.
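#
# As a sketch of that advice (the key name and value here are made up for
# illustration, and this assumes an annotation applies to the values that
# follow it), a sender might announce the small type system and then send a
# value that fits in 16 bits:
@s
{
"reading" = 2150
}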
#
# One last note... about UTF-8... DSD is encoding-independent. (Well... okay...
# it assumes ASCII. But as long as your encoding maps code points 32 through
# 127 identically to ASCII, DSD doesn't care.) The only valid place for
# non-ASCII characters to appear is inside strings, and that works as long as
# your encoding never produces a multi-octet code point containing octet 0x22
# (the double quote character). It turns out that both UTF-8 and Shift-JIS
# work fine; neither of these encodings will produce multi-octet code points
# that include the double quote (") character.