Commit d0da442

v0.8.0: add a confidence metric and filter
- How much of a clear winner is the attack time the algorithm chooses? Or would it be hard to pick out the peak because there's high response in the neighborhood? (Say, a ringing attack, or a two-part attack like a clap sound.)
- Is it far enough away from the chosen attack time that it would actually impact the sync?

Based on all that, I designed a goofy lil confidence metric with some fun math, and it seems to do a reasonable job at matching my own visual inspections.

- Also, there's now an option of only unbiasing charts/files above a certain confidence. This is helpful, I swear.
1 parent 8411ead commit d0da442

5 files changed: +276 −39 lines changed


README.md

Lines changed: 24 additions & 0 deletions
```diff
@@ -103,6 +103,30 @@ This is the most informative plot (imo), and can also help identify other sync i
 ![Convolution response of Perfect (ITG1)](doc/bias-postkernel-Perfect.png)
 
 
+## Confidence
+
+Not every track has a sharply defined beat throughout. Sometimes the sync fingerprint can pinpoint the attack clearly; other tracks might exhibit a lot of uncertainty, or the simfile skeleton might not define the correct BPMs. This tool is (for the moment) only interested in offset identification and adjustment, and we don't want to mess with files that are unclear - or in a state where moving the offset won't make the sync better. With that in mind, a **confidence metric** is introduced.
+
+### What makes a good confidence metric?
+What could cause the algorithm to pick an incorrect sync bias? Let's consider the following:
+- How much of a clear winner is the attack time the algorithm chooses? Or would it be hard to pick out the peak because there's high response in the neighborhood (say, a ringing attack, or a two-part attack like a clap sound)?
+- Is this extra algorithmic response far enough away from the chosen attack time that it would actually impact the sync?
+
+With that in mind, the following calculations are performed:
+1. For each point in the flattened convolution response (the white squiggle in the sync fingerprint plots), measure the following:
+   - *v*, the point's height above the response's median, scaled so that the identified peak sits at 1.
+   - *d*, the time difference from the identified peak.
+1. Balance these two measurements using power functions and multiply them together to calculate the "unconfidence" this point contributes. (The current confidence calculation uses *v*^4 × *d*^1.5.)
+1. Average all of these "unconfidence" values and subtract the result from 1 to obtain a "confidence" value.
+1. Apply some perceptual scaling and express it as a percentage.
+
+The actual values returned from the confidence metric don't have an intrinsic meaning - that is, there's nothing in the plot you can point to that directly produces them - but it's expected that "messier" plots result in lower confidence, and "sharper" plots in higher confidence.
+
+Note that a value of 100% or near-100% confidence does not mean the current sync is *correct*, just that the algorithm can't see anything in the rest of the fingerprint to convince it that the peak could possibly lie elsewhere.
+
+The GUI includes a control to tune the minimum confidence at which unbiasing is applied, expressed as a percentage out of 100. The CLI offers the same parameter, but as a proportion of unity - for example, to apply unbiasing only above 80% confidence, pass `--confidence 0.80` at the command line. The CSV output also expresses the confidence as a proportion of unity.
+
+
 ## Future plans
 - Code cleanup
 - Performance optimization (need to move to MVC model :weary:)
```
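
To make the recipe concrete, here is a condensed NumPy sketch of the calculation described above. The function and argument names are illustrative; the constants (0.5 ms grace, 10 ms scale, 0.83 normalizer) are the `_NEARNESS_OFFSET`, `_NEARNESS_SCALAR`, and `_THEORETICAL_UPPER` values introduced in `nine_or_null/__init__.py` below.

```python
import numpy as np

def sync_confidence(times_ms: np.ndarray, response: np.ndarray) -> float:
    # Rescale the flattened convolution response to [0, 1].
    v = np.interp(response, (response.min(), response.max()), (0, 1))
    i_max = np.argmax(v)
    v_median = np.median(v)
    # v: height above the median, scaled so the chosen peak sits at 1.
    v_rival = np.maximum(0, (v - v_median) / (v[i_max] - v_median))
    # d: time distance from the chosen peak, less a 0.5 ms grace margin,
    # scaled by 10 ms.
    d = np.maximum(0, np.abs(times_ms - times_ms[i_max]) - 0.5) / 10
    # Each point's "unconfidence" contribution, averaged over the window.
    unconfidence = np.mean(np.power(v_rival, 4) * np.power(d, 1.5))
    # Perceptual scaling: fifth root, then normalize by an empirical upper bound.
    return min(1.0, (1 - np.power(unconfidence, 0.2)) / 0.83)
```

A lone sharp peak leaves `unconfidence` near zero, so the result approaches 100%; a strong rival response far from the peak drags it down.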

nine-or-null/nine_or_null.ipynb

Lines changed: 137 additions & 0 deletions
```diff
@@ -253,6 +253,143 @@
     "    if os.path.isdir(os.path.join(pack_dir, d)):\n",
     "        print(d)\n"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = 3.14159265\n",
+    "d = f'{a:0.3f}'\n",
+    "print(d)\n",
+    "b = '{:0.3f}'\n",
+    "c = b.format(a)\n",
+    "print(c)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for test_simfile_path in [\n",
+    "    r'C:\\Games\\ITGmania\\Songs\\ephemera v0.2\\crew -All Hands on the Deck-\\crew.ssc',\n",
+    "    r'C:\\Games\\ITGmania\\Songs\\ephemera v0.2\\PPBQ\\ppbq.ssc',\n",
+    "    r'C:\\Games\\ITGmania\\Songs\\ephemera v0.2\\Adrenalina\\adrenalina.ssc'\n",
+    "]:\n",
+    "    base_simfile = simfile.open(test_simfile_path)\n",
+    "    for chart_index, chart in enumerate(base_simfile.charts):\n",
+    "        if any(k in chart for k in ['OFFSET', 'BPMS', 'STOPS', 'DELAYS', 'WARPS']):\n",
+    "            print(f'{base_simfile.title}: {chart_index} ({chart.difficulty}) has split timing')\n",
+    "\n",
+    "    # for k, v in base_simfile.charts[1].items():\n",
+    "    #     print(f'{k}: {v}')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = [None, 3]\n",
+    "print(os.path.join(os.getcwd(), '*'))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Two aspects to confidence:\n",
+    "# - is the overall max response a clear winner, or are there other contenders?\n",
+    "#   - (second-highest - median) / (highest - median)\n",
+    "# - is the response outside of the max tight, or does it have a lot of variance/noise?\n",
+    "#   - stdev after scaling\n",
+    "\n",
+    "import os\n",
+    "import csv\n",
+    "import glob\n",
+    "import numpy as np\n",
+    "from matplotlib import pyplot as plt\n",
+    "\n",
+    "edge_discard = 3\n",
+    "\n",
+    "conv_list = glob.glob(r'C:\\Games\\ITGmania\\Songs\\ITL Online 2023\\__bias-check\\convolution-*.csv')\n",
+    "\n",
+    "conv_data = {}\n",
+    "conv_results = []\n",
+    "\n",
+    "for i, f in enumerate(conv_list):\n",
+    "    t = []\n",
+    "    v = []\n",
+    "    with open(f, 'r', encoding='ascii') as fp:\n",
+    "        reader = csv.reader(fp)\n",
+    "        for row in reader:\n",
+    "            t += [float(row[0])]\n",
+    "            v += [float(row[1])]\n",
+    "    v_clip = v[edge_discard:-edge_discard]\n",
+    "    v_clip = np.interp(v_clip, (min(v_clip), max(v_clip)), (0, 1))\n",
+    "    t_clip = np.array(t[edge_discard:-edge_discard])\n",
+    "    v_std = np.std(v_clip)\n",
+    "    v_mean = np.mean(v_clip)\n",
+    "    v_median = np.median(v_clip)\n",
+    "    v_argmax = np.argmax(v_clip)\n",
+    "    v_max = v_clip[v_argmax]\n",
+    "    v_20 = np.percentile(v_clip, 20)\n",
+    "    v_80 = np.percentile(v_clip, 80)\n",
+    "\n",
+    "    # Local maxima\n",
+    "    v_moving_diff = v_clip[1:] - v_clip[:-1]\n",
+    "    v_local_maxima = [i+1 for i, vmd in enumerate(zip(v_clip[:-2], v_clip[1:-1], v_clip[2:])) if (vmd[1] > vmd[0]) and (vmd[1] > vmd[2])]\n",
+    "    v_peaks = sorted(zip(v_local_maxima, v_clip[v_local_maxima]), key=lambda x: x[1], reverse=True)[:6]\n",
+    "    # print([f'{v[0]}: {v[1]}' for v in v_peaks])\n",
+    "    N_SAMPLES_NOT_NEAR = 10\n",
+    "    v_peaks_not_near = [v for v in v_peaks if abs(v[0] - v_peaks[0][0]) > N_SAMPLES_NOT_NEAR]\n",
+    "    maxness = 0\n",
+    "    if len(v_peaks_not_near) > 0:\n",
+    "        maxness = (v_peaks_not_near[0][1] - v_median) / (v_peaks[0][1] - v_median)\n",
+    "\n",
+    "    # Another approach...\n",
+    "    THEORETICAL_UPPER = 0.83\n",
+    "    NEARNESS_SCALAR = 10 # milliseconds\n",
+    "    NEARNESS_OFFSET = 0.5 # milliseconds\n",
+    "\n",
+    "    v_max_check = np.vstack((np.zeros_like(v_clip), (v_clip - v_median) / (v_max - v_median)))\n",
+    "    v_max_rivaling = np.max(v_max_check, axis=0)\n",
+    "    t_close_check = np.vstack((np.zeros_like(t_clip), abs(t_clip - t_clip[v_argmax]) - NEARNESS_OFFSET)) / NEARNESS_SCALAR\n",
+    "    t_close_enough = np.max(t_close_check, axis=0)\n",
+    "    max_influence = np.power(v_max_rivaling, 4) * np.power(t_close_enough, 1.5)\n",
+    "    total_max_influence = np.sum(max_influence) / np.size(max_influence)\n",
+    "    confidence = min(1, (1 - np.power(total_max_influence, 0.2)) / THEORETICAL_UPPER)\n",
+    "\n",
+    "    print(f'{i:3d}/{len(conv_list):3d} {os.path.split(f)[1]:50s}: median = {v_median:0.6f}, stdev = {v_std:0.6f}, confidence = {confidence*100:0.2f}%')\n",
+    "    conv_results.append((os.path.split(f)[1], confidence))\n",
+    "\n",
+    "    if confidence < 0.75 or i % 20 == 0:\n",
+    "        plt.figure(figsize=(6, 6))\n",
+    "        plt.title(os.path.split(f)[1] + f'\\nstdev = {v_std:0.6f}, iqr = {v_80-v_20:0.6f}, confidence = {confidence*100:0.2f}%')\n",
+    "        # plt.plot(t[edge_discard:-edge_discard], v[edge_discard:-edge_discard])\n",
+    "        # plt.plot(sorted(v_clip))\n",
+    "        plt.plot(v_clip)\n",
+    "        plt.plot(max_influence, 'm')\n",
+    "        # plt.plot(v_max_rivaling, 'c')\n",
+    "        # plt.plot(np.full_like(v_clip, v_mean + v_std*3), 'r')\n",
+    "        # plt.plot(np.full_like(v_clip, v_mean), 'g')\n",
+    "        # plt.plot(np.full_like(v_clip, v_mean - v_std*3), 'r')\n",
+    "        plt.plot(np.full_like(v_clip, v_20), 'c')\n",
+    "        plt.plot(np.full_like(v_clip, v_median), 'g')\n",
+    "        plt.plot(np.full_like(v_clip, v_80), 'c')\n",
+    "        # plt.plot([v[0] for v in v_peaks_not_near], [v[1] for v in v_peaks_not_near], 'k.')\n",
+    "        plt.savefig(\"conf-\" + os.path.split(f)[1][12:-4] + \".png\")\n",
+    "        plt.close()\n",
+    "\n",
+    "for v in sorted(conv_results, key=lambda v: v[1]):\n",
+    "    print(f'{v[0]:50s}: {v[1]*100:0.2f}% confidence')\n"
+   ]
   }
  ],
  "metadata": {
```

nine-or-null/nine_or_null/__init__.py

Lines changed: 66 additions & 16 deletions
```diff
@@ -1,4 +1,4 @@
-_VERSION = '0.7.1'
+_VERSION = '0.8.0'
 
 from collections.abc import Container
 import csv
@@ -25,6 +25,9 @@
     'path',
     'slot',
     'bias',
+    'conf',
+    'interquintile',
+    'stdev',
     'paradigm',
     'timestamp',
     'fingerprint_ms',
@@ -47,6 +50,7 @@
     'consider_null': 'Consider charts close enough to 0ms bias to be "correct" under the null (StepMania) sync paradigm.',
     'consider_p9ms': 'Consider charts close enough to +9ms bias to be "correct" under the In The Groove sync paradigm.',
     'tolerance': 'If a simfile\'s sync bias lands within a paradigm ± this tolerance, that counts as "close enough".',
+    'confidence_limit': 'If the confidence in a simfile\'s sync bias is below this value, it will not be considered for unbiasing.',
     'fingerprint_ms': '[ms] Time margin on either side of the beat to analyze.',
     'window_ms': '[ms] The spectrogram algorithm\'s moving window parameter.',
     'step_ms': '[ms] Controls the spectrogram algorithm\'s overlap parameter, but expressed as a step size.',
@@ -56,6 +60,9 @@
     'full_spectrogram': 'Analyze the full spectrogram in one go - this will make the program run slower...',
     'to_paradigm': 'Choose a target paradigm for the pack unbiasing step. This will modify your simfiles!'
 }
+_THEORETICAL_UPPER = 0.83
+_NEARNESS_SCALAR = 10 # milliseconds
+_NEARNESS_OFFSET = 0.5 # milliseconds
 
 class FloatRange(Container):
     # Endpoint inclusive.
@@ -89,6 +96,7 @@ class KernelTarget(IntEnum):
     'consider_null': True,
     'consider_p9ms': True,
     'tolerance': 3.0,
+    'confidence_limit': 80,
     'fingerprint_ms': 50,
     'window_ms': 10,
     'step_ms': 0.2,
@@ -534,10 +542,32 @@ def check_sync_bias(simfile_dir, base_simfile, chart_index=None, report_path=Non
     fingerprint_times_ms = fingerprint_times * 1e3
 
     # Choose the highest response to the convolution as the downbeat attack
-    sync_bias_ms = fingerprint_times_ms[np.argmax(post_kernel_flat[edge_discard:-edge_discard]) + edge_discard] + magic_offset_ms
+    post_kernel_clip = post_kernel_flat[edge_discard:-edge_discard]
+    i_max = np.argmax(post_kernel_clip)
+    sync_bias_ms = fingerprint_times_ms[i_max + edge_discard] + magic_offset_ms
     probable_bias = guess_paradigm(sync_bias_ms, short_paradigm=False, **kwargs)
     # print(f'Sync bias: {sync_bias:0.3f} ({probable_bias})')
 
+    # Calculate a confidence statistic based on the presence of conflicting
+    # high-level response distant from the chosen peak
+    v_clip = np.interp(post_kernel_clip, (min(post_kernel_clip), max(post_kernel_clip)), (0, 1))
+    t_clip = fingerprint_times_ms[edge_discard:-edge_discard]
+    v_std = np.std(v_clip)
+    v_mean = np.mean(v_clip)
+    v_median = np.median(v_clip)
+    v_20 = np.percentile(v_clip, 20)
+    v_80 = np.percentile(v_clip, 80)
+    v_max = v_clip[i_max]
+    v_max_check = np.vstack((np.zeros_like(v_clip), (v_clip - v_median) / (v_max - v_median)))
+    v_max_rivaling = np.max(v_max_check, axis=0)
+    t_close_check = np.vstack((np.zeros_like(t_clip), abs(t_clip - t_clip[i_max]) - _NEARNESS_OFFSET)) / _NEARNESS_SCALAR
+    t_close_enough = np.max(t_close_check, axis=0)
+    max_influence = np.power(v_max_rivaling, 4) * np.power(t_close_enough, 1.5)
+    total_max_influence = np.sum(max_influence) / np.size(max_influence)
+    sync_confidence = min(1, (1 - np.power(total_max_influence, 0.2)) / _THEORETICAL_UPPER)
+    conv_interquintile = v_80 - v_20
+    conv_stdev = v_std
+
     full_title = get_full_title(base_simfile)
 
     plot_tag_vars = kwargs.get('tag_vars', {})
@@ -553,16 +583,21 @@ def check_sync_bias(simfile_dir, base_simfile, chart_index=None, report_path=Non
     fingerprint['steps_type'] = chart['STEPSTYPE']
     fingerprint['chart_slot'] = chart['DIFFICULTY']
     chart_tag = ' ' + slot_abbreviation(chart['STEPSTYPE'], chart['DIFFICULTY'], chart_index=chart_index, paradigm=guess_paradigm(sync_bias_ms, **kwargs))
-    fingerprint['sample_rate'] = audio.frame_rate
-    fingerprint['beat_digest'] = digest
+    fingerprint['sample_rate']  = audio.frame_rate
+    fingerprint['beat_digest']  = digest
     fingerprint['beat_indices'] = np.array(beat_indices)
-    fingerprint['freq_domain'] = acc
-    fingerprint['post_kernel'] = post_kernel
-    fingerprint['convolution'] = post_kernel_flat
-    fingerprint['frequencies'] = frequencies * 1e-3
-    fingerprint['time_values'] = fingerprint_times_ms
-    fingerprint['bias_result'] = sync_bias_ms
-    fingerprint['plots_title'] = f'Sync fingerprint{plot_tag}\n{simfile_artist} - "{full_title}"{chart_tag}\nSync bias: {sync_bias_ms:+0.1f} ms ({probable_bias})'
+    fingerprint['freq_domain']  = acc
+    fingerprint['post_kernel']  = post_kernel
+    fingerprint['convolution']  = post_kernel_flat
+    fingerprint['frequencies']  = frequencies * 1e-3
+    fingerprint['time_values']  = fingerprint_times_ms
+    fingerprint['bias_result']  = sync_bias_ms
+    fingerprint['confidence']   = sync_confidence
+    fingerprint['conv_stdev']   = conv_stdev
+    fingerprint['conv_quint']   = conv_interquintile
+    fingerprint['plots_title']  = \
+        f'Sync fingerprint{plot_tag}\n{simfile_artist} - "{full_title}"{chart_tag}' + \
+        f'\n{sync_bias_ms:+0.1f} ms bias ({probable_bias}), {round(sync_confidence*100):d}% conf'
 
     sanitized_title = slugify(full_title + chart_tag, allow_unicode=False)
     target_axes = []
@@ -574,6 +609,12 @@ def check_sync_bias(simfile_dir, base_simfile, chart_index=None, report_path=Non
 
     plot_fingerprint(fingerprint, target_axes, **kwargs)
 
+    # DEBUG: convolution output for confidence research
+    with open(os.path.join(report_path, f'convolution-{sanitized_title}.csv'), 'w', newline='', encoding='ascii') as conv_fp:
+        writer = csv.writer(conv_fp)
+        for t, v in zip(fingerprint_times_ms, post_kernel_flat):
+            writer.writerow([f'{t:0.6f}', f'{v:0.6f}'])
+
     for i, v in enumerate(['freqdomain', 'beatdigest', 'postkernel']):
         fig = target_figs[i]
         if show_intermediate_plots:
@@ -696,6 +737,9 @@ def batch_process(root_path=None, **kwargs):
         for split_chart in charts_within:
             fp = check_sync_bias(p, base_simfile, chart_index=split_chart, save_plots=True, show_intermediate_plots=False, **kwargs)
             sync_bias_ms = fp['bias_result']
+            sync_confidence = fp['confidence']
+            conv_quint = 'conv_quint' in fp and f"{fp['conv_quint']:0.6f}" or '----'
+            conv_stdev = 'conv_stdev' in fp and f"{fp['conv_stdev']:0.6f}" or '----'
 
             chart_abbr = '*'
             if split_chart is not None:
@@ -707,14 +751,16 @@ def batch_process(root_path=None, **kwargs):
 
             logging.info(f'\t{fp_lookup}')
             logging.info(f'\tderived sync bias = {sync_bias_ms:+0.1f} ms ({guess_paradigm(sync_bias_ms, short_paradigm=False, **kwargs)})')
+            logging.info(f'\tbias confidence = {round(sync_confidence*100):3d}% (interquintile spread = {conv_quint}, stdev = {conv_stdev})')
             if gui_hook is not None:
                 row_index = len(fingerprints)-1
                 gui_hook.grid_results.InsertRows(row_index, 1)
                 gui_hook.grid_results.SetCellValue(row_index, 0, os.path.relpath(p, root_path))
                 gui_hook.grid_results.SetCellValue(row_index, 1, chart_abbr)
                 gui_hook.grid_results.SetCellValue(row_index, 2, f'{sync_bias_ms:+0.1f}')
-                gui_hook.grid_results.SetCellValue(row_index, 3, guess_paradigm(sync_bias_ms, **kwargs))
-                gui_hook.grid_results.MakeCellVisible(row_index, 3)
+                gui_hook.grid_results.SetCellValue(row_index, 3, f'{round(sync_confidence*100):3d}%')
+                gui_hook.grid_results.SetCellValue(row_index, 4, guess_paradigm(sync_bias_ms, **kwargs))
+                gui_hook.grid_results.MakeCellVisible(row_index, 4)
                 for j in range(4):
                     gui_hook.grid_results.SetReadOnly(row_index, j)
                 gui_hook.grid_results.ForceRefresh()
@@ -724,6 +770,9 @@ def batch_process(root_path=None, **kwargs):
                 'path': os.path.relpath(p, root_path),
                 'slot': chart_abbr,
                 'bias': f'{sync_bias_ms:0.3f}',
+                'conf': f'{sync_confidence:0.4f}',
+                'interquintile': f"{fp.get('conv_quint', None)}",
+                'stdev': f"{fp.get('conv_stdev', None)}",
                 'paradigm': guess_paradigm(sync_bias_ms, **kwargs),
                 'timestamp': timestamp(),
                 'sample_rate': fp.get('sample_rate', None)
@@ -755,7 +804,8 @@ def batch_adjust(fingerprints, target_bias, **params):
         if affect_rows is not None and i not in affect_rows:
             continue
         current_paradigm = fingerprints[k].get('bias_adjust', guess_paradigm(fingerprints[k]['bias_result'], **params))
-        if current_paradigm == source_bias:
+        current_confidence = fingerprints[k].get('confidence', 100)
+        if current_paradigm == source_bias and current_confidence >= params.get('confidence_limit', 0):
             logging.info(f'\t{k}')
             # Open simfile
             p, abbr = os.path.split(k)
@@ -782,7 +832,7 @@ def batch_adjust(fingerprints, target_bias, **params):
             steps_type, chart_slot, chart_index = slot_expansion(abbr)
             if chart_index is None:
                 chart_index = [i for i, c in enumerate(sm.charts) if c['STEPSTYPE'] == steps_type and c['DIFFICULTY'] == chart_slot][0]
-            prev_offset = float(sm.charts[chart_index]['OFFSET'])
+            prev_offset = float(sm.charts[chart_index].get('OFFSET', sm.offset))
             new_offset = prev_offset + bias_shift
             logging.info(f'\t{prev_offset:6.3f} -> {new_offset:6.3f}: {k}')
             sm.charts[chart_index]['OFFSET'] = f'{new_offset:0.3f}'
@@ -793,7 +843,7 @@ def batch_adjust(fingerprints, target_bias, **params):
         if gui_hook is not None:
             font_cell = gui_hook.grid_results.GetCellFont(i, 0)
             gui_hook.grid_results.SetCellValue(i, 2, f"{fingerprints[k]['bias_result']:+0.1f}")
-            gui_hook.grid_results.SetCellValue(i, 3, target_bias)
+            gui_hook.grid_results.SetCellValue(i, 4, target_bias)
             for j in range(gui_hook.grid_results.GetNumberCols()):
                 gui_hook.grid_results.SetCellFont(i, j, font_cell.MakeBold())
 
```
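With the filter in place, `batch_adjust` skips any chart whose stored confidence falls below `confidence_limit` (the GUI default is 80). A hypothetical CLI invocation - the `--confidence` flag is the one documented in the README above, while the entry-point name and pack path are assumptions for illustration:

```sh
# Only unbias charts whose sync-bias confidence is at least 80%.
python -m nine_or_null "C:\Games\ITGmania\Songs\MyPack" --confidence 0.80
```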

0 commit comments