Commit 96853bc

sueka authored and Ch4s3 committed
Fix the corner cases (#173)
* Avoid zero division
* Fix a bug caused by indeterminate form
* Fix a bug occurring when a zero vector try normalizing itself
* Fix a bug
* Revert "Fix a bug"
  This reverts commit 151e9f8.
* Revert "Revert "Fix a bug""
  This reverts commit 02b98d8.
* add a test that may raise ZeroDivisionError
* add a test for zero vectors
* add a test that may raise Vector::ZeroVectorError
* fix cutoff
1 parent 9b1e6c5 commit 96853bc

6 files changed, 41 additions and 3 deletions

lib/classifier-reborn/extensions/vector.rb

Lines changed: 5 additions & 1 deletion
@@ -31,7 +31,11 @@ def SV_decomp(maxSweeps = 20)
         (1..qrot.row_size - 1).each do |col|
           next if row == col
 
-          h = Math.atan((2 * qrot[row, col]) / (qrot[row, row] - qrot[col, col])) / 2.0
+          if (2.0 * qrot[row, col]) == (qrot[row, row] - qrot[col, col])
+            h = Math.atan(1) / 2.0
+          else
+            h = Math.atan((2.0 * qrot[row, col]) / (qrot[row, row] - qrot[col, col])) / 2.0
+          end
           hcos = Math.cos(h)
           hsin = Math.sin(h)
           mzrot = Matrix.identity(qrot.row_size)
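
The guard in this hunk covers the input used by `test_zero_division`: for an identity matrix with Integer entries, the old expression evaluates `0 / 0` on Integers, which raises `ZeroDivisionError`, while the same quotient on Floats is NaN (the "indeterminate form" in the commit message). The following is a minimal sketch of both behaviours. `unguarded_rotation_angle` and `rotation_angle` are hypothetical standalone helpers that only mirror the old and new expressions; they are not methods of the library.

```ruby
# Hypothetical helper mirroring the pre-patch expression.
def unguarded_rotation_angle(off_diag, diag_a, diag_b)
  # With Integer arguments this is Integer division, so a zero
  # denominator raises ZeroDivisionError.
  Math.atan((2 * off_diag) / (diag_a - diag_b)) / 2.0
end

# Hypothetical helper mirroring the patched expression.
def rotation_angle(off_diag, diag_a, diag_b)
  numerator = 2.0 * off_diag
  denominator = diag_a - diag_b
  if numerator == denominator
    # Covers 0/0 (and any ratio of exactly 1): use atan(1) directly.
    Math.atan(1) / 2.0
  else
    # The Float numerator makes a zero denominator yield +/-Infinity
    # instead of raising, and atan(Infinity) is well defined.
    Math.atan(numerator / denominator) / 2.0
  end
end

# Identity-matrix entries: off-diagonal 0, both diagonals 1.
angle = rotation_angle(0, 1, 1) # atan(1) / 2, i.e. pi / 8
```

Note that the equality test also matches any nonzero pair with a ratio of exactly 1, where it returns the same value the division would have produced.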

lib/classifier-reborn/extensions/zero_vector.rb

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+class Vector
+  def zero?
+    self.all? {|_| _ == 0}
+  end
+end
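
The reason for this predicate is that `Vector#normalize` in Ruby's `matrix` library raises `Vector::ZeroVectorError` for a vector of magnitude zero. The sketch below shows that behaviour and the "keep zero vectors as they are" policy that the `lsi.rb` change applies. `normalize_or_keep` is a hypothetical helper name used only for illustration, and it inlines the all-zero check rather than reopening `Vector`.

```ruby
require 'matrix'

# Hypothetical helper (not part of classifier-reborn): normalize a
# vector, but return an all-zero vector unchanged instead of raising.
def normalize_or_keep(vec)
  # Same check as the zero? method added in this commit; it is
  # vacuously true for the empty vector Vector[].
  return vec if vec.all? { |x| x == 0 }

  vec.normalize
end

zero = Vector[0, 0]
unit = normalize_or_keep(Vector[3, 4]) # magnitude 5, so roughly Vector[0.6, 0.8]
kept = normalize_or_keep(zero)         # returned as-is
```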

lib/classifier-reborn/lsi.rb

Lines changed: 7 additions & 2 deletions
@@ -12,6 +12,7 @@
 rescue LoadError
   $GSL = false
   require_relative 'extensions/vector'
+  require_relative 'extensions/zero_vector'
 end
 
 require_relative 'lsi/word_list'
@@ -142,9 +143,13 @@ def build_index(cutoff = 0.75)
         tdm = Matrix.rows(tda).trans
         ntdm = build_reduced_matrix(tdm, cutoff)
 
-        ntdm.row_size.times do |col|
+        ntdm.column_size.times do |col|
           doc_list[col].lsi_vector = ntdm.column(col) if doc_list[col]
-          doc_list[col].lsi_norm = ntdm.column(col).normalize if doc_list[col]
+          if ntdm.column(col).zero?
+            doc_list[col].lsi_norm = ntdm.column(col) if doc_list[col]
+          else
+            doc_list[col].lsi_norm = ntdm.column(col).normalize if doc_list[col]
+          end
         end
       end
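
The switch from `row_size` to `column_size` matters whenever the reduced matrix is not square, because the loop variable is used as a column index. A small sketch with Ruby's `matrix` library follows, assuming (as the `doc_list[col]` indexing suggests) that columns correspond to documents. `visit_columns` is a hypothetical helper, and `row_count`/`column_count` are the names that `row_size`/`column_size` alias.

```ruby
require 'matrix'

# Hypothetical helper: fetch columns 0...bound, the way the loop in
# build_index does with its `col` index.
def visit_columns(matrix, bound)
  bound.times.map { |col| matrix.column(col) }
end

# 2 rows, 3 columns: more columns than rows.
wide = Matrix[[1, 2, 3], [4, 5, 6]]

by_rows = visit_columns(wide, wide.row_count)    # only 2 columns visited
by_cols = visit_columns(wide, wide.column_count) # all 3 columns visited
```

With the row count as the bound, the last column of a wide matrix is never visited, so the corresponding document would not receive an `lsi_vector` or `lsi_norm`. For a tall matrix the extra indices make `Matrix#column` return `nil`.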

test/extensions/matrix_test.rb

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+class MatrixTest < Minitest::Test
+  def test_zero_division
+    matrix = Matrix[[1, 0], [0, 1]]
+    matrix.SV_decomp
+  end
+end

test/extensions/zero_vector_test.rb

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+class ZeroVectorTest < Minitest::Test
+  def test_zero?
+    vec0 = Vector[]
+    vec1 = Vector[0]
+    vec10 = Vector.elements [0] * 10
+    assert vec0.zero?
+    assert vec1.zero?
+    assert vec10.zero?
+  end
+end

test/lsi/lsi_test.rb

Lines changed: 8 additions & 0 deletions
@@ -30,6 +30,14 @@ def test_not_auto_rebuild
     assert !lsi.needs_rebuild?
   end
 
+  def test_zero_vector_normalization
+    lsi = ClassifierReborn::LSI.new auto_rebuild: false
+    lsi.add_item @str1[0...8], 'Dog'
+    lsi.add_item @str2, 'Dog'
+    lsi.add_item @str3, 'Cat'
+    lsi.build_index(0.75)
+  end
+
   def test_basic_categorizing
     lsi = ClassifierReborn::LSI.new
     lsi.add_item @str2, 'Dog'
