Commit bb971a2

Bump classifier version to 1.4.4 and improve LSI content node scaling
- Update classifier version from 1.4.3 to 1.4.4 in Gemfile.lock and gemspec
- Enhance LSI content node scaling by considering unique words count
- Add test case for classifying repeated words in LSI
1 parent 40f3215 commit bb971a2

4 files changed, 5 additions and 3 deletions

Gemfile.lock

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    classifier (1.4.3)
+    classifier (1.4.4)
       fast-stemmer (~> 1.0)
       mutex_m (~> 0.2)
       rake

classifier.gemspec

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
   s.name = 'classifier'
-  s.version = '1.4.3'
+  s.version = '1.4.4'
   s.summary = 'A general classifier module to allow Bayesian and other types of classifications.'
   s.description = 'A general classifier module to allow Bayesian and other types of classifications.'
   s.author = 'Lucas Carlson'

lib/classifier/lsi/content_node.rb

Lines changed: 2 additions & 1 deletion
@@ -45,10 +45,11 @@ def raw_vector_with(word_list)
 
       # Perform the scaling transform
       total_words = $GSL ? vec.sum : vec.sum_with_identity
+      total_unique_words = vec.count { |word| word != 0 }
 
       # Perform first-order association transform if this vector has more
       # than one word in it.
-      if total_words > 1.0
+      if total_words > 1.0 && total_unique_words > 1
         weighted_total = 0.0
 
         vec.each do |term|
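
The guard added above can be illustrated with a minimal sketch on plain Ruby arrays. The helper names `association_transform_applies?` and `entropy_sum` are hypothetical and not part of the classifier gem. The entropy-style sum is an assumption about what the rest of the method computes, since that code falls outside this hunk. The sketch only suggests why a vector containing a single distinct word, repeated, might need to skip the transform.

```ruby
# Hypothetical helper (not in the gem): mirrors only the guard condition
# from the diff, using a plain Array of term counts instead of a GSL vector.
def association_transform_applies?(vec)
  total_words = vec.sum.to_f
  total_unique_words = vec.count { |word| word != 0 }
  total_words > 1.0 && total_unique_words > 1
end

# ASSUMPTION: the transform normalizes by a sum of p * log(p) terms.
# This is a guess at the code outside the hunk, shown for illustration only.
def entropy_sum(vec)
  total_words = vec.sum.to_f
  vec.select(&:positive?).sum { |term| (term / total_words) * Math.log(term / total_words) }
end

repeated_word = [2.0, 0.0, 0.0] # one distinct term seen twice
two_terms     = [1.0, 1.0, 0.0] # two distinct terms seen once each

puts association_transform_applies?(repeated_word) # false: only one unique word
puts association_transform_applies?(two_terms)     # true
puts entropy_sum(repeated_word).zero?              # true: 1.0 * log(1.0) is 0.0
```

Under that assumption, the old condition `total_words > 1.0` would let a repeated single word through and the normalizer would be zero. The new `total_unique_words > 1` check skips the transform in that case.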

test/lsi/lsi_test.rb

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ def test_basic_categorizing
     assert_equal 'Dog', lsi.classify(@str1)
     assert_equal 'Cat', lsi.classify(@str3)
     assert_equal 'Bird', lsi.classify(@str5)
+    assert_equal 'Bird', lsi.classify('Bird me to Bird')
   end
 
   def test_external_classifying
