Generate tones musically from the CLI. With this package, you can pass musical notes via command line arguments and output them as audio and/or write them to a WAV file.
(coming soon)
The simplest usage is shown in the example below. A more intricate example is shown in the Usage Example section:
$ tones C4.4 D3.4 E-5.2 ?.2 D4.2 C3.1
This would output a sine wave tone consisting of the following notes to whatever the current OS's audio output is set to:
- C quarter note in the 4th octave → C♩
- D quarter note in the 3rd octave → D♩
- E flat half note in the 5th octave → E♭𝅗𝅥
- a half rest (silence for duration of a half note) → 𝄼
- D half note in the 4th octave → D𝅗𝅥
- C whole note in the 3rd octave → C𝅝
A tone can have one of 5 wave shapes. Each wave shape can be represented by one or more case-insensitive strings. The wave shapes and their strings are:
- Sine ∿
  The strings for this wave shape are: SIN and SINE.
- Square ⎍
  The strings for this wave shape are: SQR and SQUARE.
- Triangle ∧
  The strings for this wave shape are: TRI and TRIANGLE.
- Saw Up ◿
  The strings for this wave shape are: SUP, SAWUP and SAW_UP.
- Saw Down ◺
  The strings for this wave shape are: SDN, SAWDOWN and SAW_DOWN.
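The alias table above can be sketched as a simple lookup. The following Python snippet is only an illustration and not this package's implementation: the canonical names (`sine`, `saw_up`, ...), the function names, and the phase convention of each waveform are assumptions made for the example.

```python
import math

# Aliases copied from the list above. The canonical names on the right
# are hypothetical labels chosen for this sketch.
WAVE_ALIASES = {
    "SIN": "sine", "SINE": "sine",
    "SQR": "square", "SQUARE": "square",
    "TRI": "triangle", "TRIANGLE": "triangle",
    "SUP": "saw_up", "SAWUP": "saw_up", "SAW_UP": "saw_up",
    "SDN": "saw_down", "SAWDOWN": "saw_down", "SAW_DOWN": "saw_down",
}


def resolve_wave(name: str) -> str:
    """Map a case-insensitive wave string to a canonical shape name."""
    try:
        return WAVE_ALIASES[name.upper()]
    except KeyError:
        raise ValueError(f"unknown wave shape: {name!r}") from None


def sample(shape: str, phase: float) -> float:
    """Return one sample in [-1, 1] for a phase measured in cycles."""
    p = phase % 1.0  # position within the current cycle, 0 <= p < 1
    if shape == "sine":
        return math.sin(2 * math.pi * p)
    if shape == "square":
        return 1.0 if p < 0.5 else -1.0
    if shape == "triangle":
        return 1.0 - 4.0 * abs(p - 0.5)  # rises from -1 to 1, then falls
    if shape == "saw_up":
        return 2.0 * p - 1.0  # ramps up from -1 toward 1
    if shape == "saw_down":
        return 1.0 - 2.0 * p  # ramps down from 1 toward -1
    raise ValueError(f"unknown canonical shape: {shape!r}")
```

For example, `resolve_wave("sup")` and `resolve_wave("Saw_Up")` both resolve to the same shape, which mirrors the case-insensitive matching described above.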
- --bpm, -b
  Sets the bpm (beats per minute)/tempo of the audio. Expects a positive integer argument and defaults to 140.
- --help, -h
  Ignores all other arguments and prints a help message about usage and the command line options and flags.
- --note-beat-value, -n
  Sets the beat value of a note. Expects a positive integer argument and defaults to 4. The simplest way to think of this value is as the bottom number of a time signature. So if the time signature is 3⁄4, then 4 is the beat value of a note. This affects how the duration gets applied to a note and probably doesn't have to be set manually in most cases.
- --scale, -s
  Sets the musical scale to use if an integer index is used to define the pitch of an audio string instead of a pitch letter, accidental, and octave. Defaults to MINOR.
- --root, -r
  Sets the root note of the scale if an integer index is used to define the pitch of an audio string instead of a pitch letter, accidental, and octave. Expects a pitch (a pitch class followed by an octave) and defaults to A4.
- --out, -o
  Outputs the audio to a 44.1 kHz/16-bit WAV file. Expects a path (or filename) that doesn't point to a pre-existing file or directory. The extension .wav is appended to the output path if it doesn't already contain a file extension.
- --quiet, -q
  Prevents audible output from being played.
- --version, -v
  Prints the version of the package.
- --wave, -w
  Sets the default wave shape to use for notes that don't have a wave shape specified. Expects a valid wave shape and defaults to SINE. The valid wave shapes can be found in the Wave Shapes section of this document.
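To make the --out behavior more concrete, here is a rough Python sketch of writing a 44.1 kHz/16-bit WAV file with the standard library `wave` module. It is not how this package writes files. The function name `write_sine_wav`, the mono channel count, and the amplitude are assumptions for illustration; only the sample rate, the bit depth, the refusal to overwrite, and the .wav suffix rule come from the option description.

```python
import math
import os
import struct
import wave

SAMPLE_RATE = 44_100  # Hz, as stated for --out
SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16 bits


def write_sine_wav(path: str, freq_hz: float, seconds: float,
                   amplitude: float = 0.5) -> str:
    """Write a mono sine tone and return the path that was written."""
    # Append ".wav" only when the path has no extension at all.
    if not os.path.splitext(path)[1]:
        path += ".wav"
    # Refuse to touch a pre-existing file or directory.
    if os.path.exists(path):
        raise FileExistsError(path)

    frames = bytearray()
    for i in range(int(SAMPLE_RATE * seconds)):
        value = amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        # Signed 16-bit little-endian PCM sample.
        frames += struct.pack("<h", int(value * 32767))

    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)  # mono is an assumption of this sketch
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return path
```

Calling `write_sine_wav("tone", 440.0, 1.0)` would create `tone.wav` in the current working directory, and calling it a second time would raise `FileExistsError`.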
Below is a more intricate example than the simple one above:
$ tones --bpm 135 --wave sup --quiet --out sandstorm C4.4 tri>D3.4 SQR>E-5.2 ?.2 sdn>D4.2 SIN>C3.1
This writes the following notes at 135 bpm to a 44.1 kHz/16-bit WAV file named sandstorm.wav in the current working directory, without audibly playing them:
- C quarter note sawtooth up wave in the 4th octave → C♩
- D quarter note triangle wave in the 3rd octave → D♩
- E flat half note square wave in the 5th octave → E♭𝅗𝅥
- a half rest (silence for duration of a half note) → 𝄼
- D half note sawtooth down wave in the 4th octave → D𝅗𝅥
- C whole note sine wave in the 3rd octave → C𝅝
The primary command line operand arguments that are required to set the synthesized sound are referred to as audio strings. An audio string is a String that contains all the information needed to synthesize audio.
The below diagram breaks down the segments that an audio string is composed of. The diagram also further breaks down the pitch segment of an audio string. The pitch segment is the component of an audio string used to dictate the frequency of the synthesized audio:
The first segment of an audio string sets what wave shape the audio will be. This segment can be omitted to use the default wave shape. If an audio string synthesizes silence (and not audible audio), then it does not have a wave shape prefix (nor a pitch). Refer to the Wave Shapes section for the different wave shapes audio can be.
The pitch segment dictates if audio is silence or an audible tone with a frequency. Pitch is composed of 2 components:
- The pitch class consists of:
  - The leading pitch letter, which is one of the alpha characters A-G,
  - followed by the accidental. The accidental character designates a note as a flat ♭, natural ♮, or sharp ♯. If it's a natural ♮, then this character can be omitted. The 3 accidental characters are:
    - a minus sign '-' for a flat ♭.
    - nothing (or an equals sign '=') for a natural ♮.
    - a plus sign '+' for a sharp ♯.
- And then finally the octave is the last segment of the pitch. The octave is simply a non-negative integer (0 or greater) that sets the octave the pitch is in.

To create silence, the pitch segment is simply a question mark character, '?'.
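The segments described so far can be pulled apart with a regular expression. The Python sketch below is only an illustration, not this package's parser. The `>` separator after the wave shape and the `.` separator before the duration are taken from the usage examples in this document. The names `AudioString`, `parse`, and `frequency`, the acceptance of lowercase pitch letters, the octave numbering (C4 as middle C), and the A4 = 440 Hz equal-temperament tuning are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional

# [wave>]pitch.duration, where pitch is '?' or letter + accidental + octave.
AUDIO_STRING = re.compile(
    r"^(?:(?P<wave>[A-Za-z_]+)>)?"
    r"(?:(?P<rest>\?)"
    r"|(?P<letter>[A-Ga-g])(?P<accidental>[-=+]?)(?P<octave>\d+))"
    r"\.(?P<duration>\d+)$"
)

SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"-": -1, "": 0, "=": 0, "+": 1}


@dataclass
class AudioString:
    wave: Optional[str]  # None means "use the default wave shape"
    midi: Optional[int]  # MIDI-style note number, None means silence
    duration: int        # N, as in a 1/N note


def parse(text: str) -> AudioString:
    m = AUDIO_STRING.match(text)
    if m is None:
        raise ValueError(f"not a valid audio string: {text!r}")
    if m["rest"]:
        midi = None
    else:
        # Assumed numbering: octave 4 starts at middle C (note number 60).
        midi = (12 * (int(m["octave"]) + 1)
                + SEMITONES[m["letter"].upper()]
                + ACCIDENTALS[m["accidental"]])
    return AudioString(m["wave"], midi, int(m["duration"]))


def frequency(midi: int) -> float:
    """Equal-temperament frequency in Hz, assuming A4 = 440 Hz."""
    return 440.0 * 2 ** ((midi - 69) / 12)
```

With these assumptions, `parse("SQR>E-5.2")` yields a square-wave E♭ in the 5th octave with duration 2, and `parse("?.2")` yields a half rest with no wave shape and no pitch.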
An alternative way to specify the pitch/frequency in an audio string is with an integer index into a musical scale. The index selects a pitch relative to the tonic (root) pitch of the scale, and it can be negative as long as it doesn't result in a pitch with a negative octave. For instance, to create quarter notes of all the pitches in the E♯ (sharp) minor scale rooted in the 3rd octave, instead of writing out the pitch letter, accidental, and octave for each note, you could simply do:
$ tones --scale minor --root e+3 0.4 1.4 2.4 3.4 4.4 5.4 6.4
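One way to picture how a scale index could resolve to a pitch is sketched below in Python. This is an assumption-laden illustration rather than this package's logic: it assumes MINOR means the natural minor scale, that indices beyond 6 or below 0 wrap into neighboring octaves, and that pitches are tracked as MIDI-style note numbers where octave 0 starts at note 12. The names `MINOR_STEPS` and `scale_pitch` are hypothetical.

```python
# Semitone offsets of the natural minor scale from its root (assumed
# meaning of MINOR for this sketch).
MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]


def scale_pitch(root_midi: int, index: int, steps=MINOR_STEPS) -> int:
    """Return the note number `index` scale degrees away from the root."""
    # Floor division and modulo handle negative indices: index -1 is the
    # last degree of the scale one octave below the root.
    octave_shift, degree = divmod(index, len(steps))
    midi = root_midi + 12 * octave_shift + steps[degree]
    if midi < 12:  # below C0 in the assumed numbering
        raise ValueError("index results in a pitch with a negative octave")
    return midi


# E sharp in the 3rd octave: 12 * (3 + 1) + 4 (E) + 1 (sharp) = 53.
ROOT = 53
print([scale_pitch(ROOT, i) for i in range(7)])
```

Under these assumptions the seven indices in the command above map to note numbers 53, 55, 56, 58, 60, 61, and 63, and index 7 would land on the root one octave higher.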
The final segment of an audio string is the duration. The duration amount is relative to the note beat value and bpm/tempo. Without these 2 additional bits of information (the note beat value and bpm/tempo) the duration amount alone isn't enough information to extrapolate the actual span of time the audio or silence should be played.
The simplest way to think of it is that if the duration is an integer N, then the length of the note will be 1⁄N. So, if N were 1, the duration would be 1⁄1, which is a whole note. If N were 4, the duration would be 1⁄4, which is a quarter note, and so on.
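The relationship between the duration, the note beat value, and the bpm can be worked through with a small calculation. The Python sketch below uses the conventional music-theory reading, in which one beat lasts 60⁄bpm seconds and a 1⁄N note spans (note beat value)⁄N beats. Whether this package computes durations exactly this way is an assumption, and the function name `note_seconds` is hypothetical. The defaults mirror the documented defaults of 140 bpm and a note beat value of 4.

```python
def note_seconds(duration: int, bpm: int = 140, beat_value: int = 4) -> float:
    """Length in seconds of a 1/duration note.

    Assumes one beat is a 1/beat_value note and lasts 60/bpm seconds.
    """
    if duration <= 0 or bpm <= 0 or beat_value <= 0:
        raise ValueError("duration, bpm and beat_value must be positive")
    beats = beat_value / duration  # e.g. a half note is 4/2 = 2 beats
    return beats * 60.0 / bpm
```

Under these assumptions, a quarter note (`note_seconds(4)`) at the default settings lasts 60⁄140 of a second, roughly 0.43 s, and a whole note lasts four times as long.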
Double accidentals are not supported. Instead, enharmonically equivalent notes should be used. For instance, C♯♯ (C double sharp) is enharmonically equivalent to D♮ (D natural). This means that if the pitch letters, accidentals, and/or octaves don't seem to match a scale, it's because scales always get normalized to the enharmonically equivalent ones within this API.