Commit 3571fa9

Whitespace
1 parent 5b45500 commit 3571fa9

12 files changed: +137 -137 lines changed

LICENSE

Lines changed: 2 additions & 2 deletions
@@ -146,7 +146,7 @@ such a program is covered only if its contents constitute a work based
 on the Library (independent of the use of the Library in a tool for
 writing it). Whether that is true depends on what the Library does
 and what the program that uses the Library does.
-
+
 1. You may copy and distribute verbatim copies of the Library's
 complete source code as you receive it, in any medium, provided that
 you conspicuously and appropriately publish on each copy an
@@ -426,4 +426,4 @@ the Free Software Foundation.
 14. If you wish to incorporate parts of the Library into other free
 programs whose distribution conditions are incompatible with these,
 write to the author to ask for permission. For software which is
-copyrighted by
+copyrighted by

README.markdown

Lines changed: 9 additions & 9 deletions
@@ -31,7 +31,7 @@ A Bayesian classifier by Lucas Carlson. Bayesian Classifiers are accurate, fast,
 b.train_interesting "here are some good words. I hope you love them"
 b.train_uninteresting "here are some bad words, I hate you"
 b.classify "I hate bad words and you" # returns 'Uninteresting'
-
+
 require 'madeleine'
 m = SnapshotMadeleine.new("bayes_data") {
 Classifier::Bayes.new 'Interesting', 'Uninteresting'
@@ -52,8 +52,8 @@ Using Madeleine, your application can persist the learned data over time.
 ## LSI
 
 A Latent Semantic Indexer by David Fayram. Latent Semantic Indexing engines
-are not as fast or as small as Bayesian classifiers, but are more flexible, providing
-fast search and clustering detection as well as semantic analysis of the text that
+are not as fast or as small as Bayesian classifiers, but are more flexible, providing
+fast search and clustering detection as well as semantic analysis of the text that
 theoretically simulates human learning.
 
 ### Usage
@@ -66,27 +66,27 @@ theoretically simulates human learning.
 ["This text also involves cats. Cats!", :cat],
 ["This text involves birds. Birds.",:bird ]]
 strings.each {|x| lsi.add_item x.first, x.last}
-
+
 lsi.search("dog", 3)
-# returns => ["This text deals with dogs. Dogs.", "This text involves dogs too. Dogs! ",
+# returns => ["This text deals with dogs. Dogs.", "This text involves dogs too. Dogs! ",
 # "This text also involves cats. Cats!"]
 
 lsi.find_related(strings[2], 2)
 # returns => ["This text revolves around cats. Cats.", "This text also involves cats. Cats!"]
-
+
 lsi.classify "This text is also about dogs!"
 # returns => :dog
-
+
 Please see the Classifier::LSI documentation for more information. It is possible to index, search and classify
-with more than just simple strings.
+with more than just simple strings.
 
 ### Latent Semantic Indexing
 
 * http://www.c2.com/cgi/wiki?LatentSemanticIndexing
 * http://www.chadfowler.com/index.cgi/Computing/LatentSemanticIndexing.rdoc
 * http://en.wikipedia.org/wiki/Latent_semantic_analysis
 
-## Authors
+## Authors
 
 * Lucas Carlson ([email protected])
 * David Fayram II ([email protected])

install.rb

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@
 files = FileList["**/*"]
 
 # File::safe_unlink *deprecated.collect{|f| File.join($sitedir, f.split(/\//))}
-files.each {|f|
+files.each {|f|
 File::install(f, File.join($sitedir, *f.split(/\//)), 0644, true)
 }

lib/classifier.rb

Lines changed: 1 addition & 1 deletion
@@ -27,4 +27,4 @@
 require 'rubygems'
 require 'classifier/extensions/string'
 require 'classifier/bayes'
-require 'classifier/lsi'
+require 'classifier/lsi'

lib/classifier/bayes.rb

Lines changed: 7 additions & 7 deletions
@@ -6,7 +6,7 @@ module Classifier
 
 class Bayes
 # The class can be created with one or more categories, each of which will be
-# initialized and given a training method. E.g.,
+# initialized and given a training method. E.g.,
 # b = Classifier::Bayes.new 'Interesting', 'Uninteresting', 'Spam'
 def initialize(*categories)
 @categories = Hash.new
@@ -56,7 +56,7 @@ def untrain(category, text)
 end
 end
 end
-
+
 #
 # Returns the scores in each category the provided +text+. E.g.,
 # b.classifications "I hate bad words and you"
@@ -80,14 +80,14 @@ def classifications(text)
 end
 
 #
-# Returns the classification of the provided +text+, which is one of the
+# Returns the classification of the provided +text+, which is one of the
 # categories given in the initializer. E.g.,
 # b.classify "I hate bad words and you"
 # => 'Uninteresting'
 def classify(text)
 (classifications(text).sort_by { |a| -a[1] })[0][0]
 end
-
+
 #
 # Provides training and untraining methods for the categories specified in Bayes#new
 # For example:
@@ -106,7 +106,7 @@ def method_missing(name, *args)
 super #raise StandardError, "No such method: #{name}"
 end
 end
-
+
 #
 # Provides a list of category names
 # For example:
@@ -115,7 +115,7 @@ def method_missing(name, *args)
 def categories # :nodoc:
 @categories.keys.collect {|c| c.to_s}
 end
-
+
 #
 # Allows you to add categories to the classifier.
 # For example:
@@ -128,7 +128,7 @@ def categories # :nodoc:
 def add_category(category)
 @categories[category.prepare_category_name] = Hash.new
 end
-
+
 alias append_category add_category
 end
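
For readers skimming the lib/classifier/bayes.rb hunks above: the body of `classify` sorts the category/score pairs in descending score order and returns the name of the first one. A minimal standalone sketch of that selection step follows. The `scores` hash and both helper names are hypothetical stand-ins for illustration; they are not part of the library, and the numbers are made up.

```ruby
# Hypothetical scores standing in for whatever Bayes#classifications returns.
scores = { 'Interesting' => -12.7, 'Uninteresting' => -9.3 }

# Same selection as the classify body in the diff:
# sort pairs by negated score (highest score first), then take the first name.
def best_category(scores)
  scores.sort_by { |pair| -pair[1] }[0][0]
end

# An equivalent selection that avoids sorting the whole list.
def best_category_max(scores)
  scores.max_by { |_name, score| score }.first
end

puts best_category(scores)
puts best_category_max(scores)
```

Both helpers pick the category with the highest score; the `max_by` form is only an alternative phrasing, not something this commit changes.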

lib/classifier/extensions/vector.rb

Lines changed: 9 additions & 9 deletions
@@ -1,5 +1,5 @@
 # Author:: Ernest Ellingson
-# Copyright:: Copyright (c) 2005
+# Copyright:: Copyright (c) 2005
 
 # These are extensions to the std-lib 'matrix' to allow an all ruby SVD
 
@@ -9,7 +9,7 @@
 class Array
 def sum(identity = 0, &block)
 return identity unless size > 0
-
+
 if block_given?
 map(&block).sum
 else
@@ -22,7 +22,7 @@ class Vector
 def magnitude
 sumsqs = 0.0
 self.size.times do |i|
-sumsqs += self[i] ** 2.0
+sumsqs += self[i] ** 2.0
 end
 Math.sqrt(sumsqs)
 end
@@ -42,7 +42,7 @@ class Matrix
 def Matrix.diag(s)
 Matrix.diagonal(*s)
 end
-
+
 alias :trans :transpose
 
 def SV_decomp(maxSweeps = 20)
@@ -51,7 +51,7 @@ def SV_decomp(maxSweeps = 20)
 else
 q = self * self.trans
 end
-
+
 qrot = q.dup
 v = Matrix.identity(q.row_size)
 azrot = nil
@@ -75,16 +75,16 @@ def SV_decomp(maxSweeps = 20)
 mzrot[col,col] = hcos
 qrot = mzrot.trans * qrot * mzrot
 v = v * mzrot
-end
+end
 end
 s_old = qrot.dup if cnt == 1
-sum_qrot = 0.0
+sum_qrot = 0.0
 if cnt > 1
 qrot.row_size.times do |r|
 sum_qrot += (qrot[r,r]-s_old[r,r]).abs if (qrot[r,r]-s_old[r,r]).abs > 0.001
 end
 s_old = qrot.dup
-end
+end
 break if (sum_qrot <= 0.001 and cnt > 1) or cnt >= maxSweeps
 end # of do while true
 s = []
@@ -93,7 +93,7 @@ def SV_decomp(maxSweeps = 20)
 end
 #puts "cnt = #{cnt}"
 if self.row_size >= self.column_size
-mu = self * v * Matrix.diagonal(*s).inverse
+mu = self * v * Matrix.diagonal(*s).inverse
 return [mu, v, s]
 else
 puts v.row_size
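
The `magnitude` method touched in the lib/classifier/extensions/vector.rb hunks above computes a Euclidean norm: it accumulates the squares of the components and takes the square root. A minimal sketch of the same computation on a plain Array, so it runs without the `matrix` extension, might look like this. The helper names are hypothetical, and the second form assumes a Ruby with the built-in `Array#sum` (2.4 or later).

```ruby
# Euclidean norm, mirroring the loop shown in the diff but over a plain Array.
def magnitude(values)
  sumsqs = 0.0
  values.each { |v| sumsqs += v ** 2.0 }
  Math.sqrt(sumsqs)
end

# The same computation using the built-in Array#sum with a block.
def magnitude_with_sum(values)
  Math.sqrt(values.sum { |v| v * v })
end

puts magnitude([3, 4])
puts magnitude_with_sum([3, 4])
```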

lib/classifier/extensions/vector_serialize.rb

Lines changed: 4 additions & 4 deletions
@@ -1,17 +1,17 @@
 module GSL
-
+
 class Vector
 def _dump(v)
 Marshal.dump( self.to_a )
 end
-
+
 def self._load(arr)
 arry = Marshal.load(arr)
 return GSL::Vector.alloc(arry)
 end
-
+
 end
-
+
 class Matrix
 class <<self
 alias :diag :diagonal
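
The `_dump` / `self._load` pair in the lib/classifier/extensions/vector_serialize.rb hunk above is Ruby's hook for custom `Marshal` serialization: the instance method returns a String, and the class method rebuilds an object from it. A minimal sketch with a plain Ruby class, assuming no GSL is installed, is shown below. `Point3` is a hypothetical class used only for illustration.

```ruby
# Hypothetical value class that serializes itself through its coordinate array,
# in the same way the diff serializes a GSL::Vector through to_a.
class Point3
  attr_reader :coords

  def initialize(coords)
    @coords = coords
  end

  # Called by Marshal.dump; must return a String.
  def _dump(_level)
    Marshal.dump(@coords)
  end

  # Called by Marshal.load with the String produced by _dump.
  def self._load(data)
    new(Marshal.load(data))
  end
end

copy = Marshal.load(Marshal.dump(Point3.new([1.0, 2.0, 3.0])))
puts copy.coords.inspect
```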

lib/classifier/extensions/word_hash.rb

Lines changed: 8 additions & 8 deletions
@@ -4,20 +4,20 @@
 
 require "set"
 
-# These are extensions to the String class to provide convenience
+# These are extensions to the String class to provide convenience
 # methods for the Classifier package.
 class String
-
-# Removes common punctuation symbols, returning a new string.
+
+# Removes common punctuation symbols, returning a new string.
 # E.g.,
 # "Hello (greeting's), with {braces} < >...?".without_punctuation
 # => "Hello greetings with braces "
 def without_punctuation
 tr( ',?.!;:"@#$%^&*()_=+[]{}\|<>/`~', " " ) .tr( "'\-", "")
 end
-
+
 # Return a Hash of strings => ints. Each word in the string is stemmed,
-# interned, and indexes to its frequency in the document.
+# interned, and indexes to its frequency in the document.
 def word_hash
 word_hash = clean_word_hash()
 symbol_hash = word_hash_for_symbols(gsub(/[\w]/," ").split)
@@ -28,9 +28,9 @@ def word_hash
 def clean_word_hash
 word_hash_for_words gsub(/[^\w\s]/,"").split
 end
-
+
 private
-
+
 def word_hash_for_words(words)
 d = Hash.new(0)
 words.each do |word|
@@ -50,7 +50,7 @@ def word_hash_for_symbols(words)
 end
 return d
 end
-
+
 CORPUS_SKIP_WORDS = Set.new([
 "a",
 "again",

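The comments in the lib/classifier/extensions/word_hash.rb hunks describe `word_hash` as mapping each word to its frequency in the document. A simplified sketch of that idea follows. It reuses the `gsub(/[^\w\s]/, "")` cleanup and the `Hash.new(0)` counter visible in the diff, but it deliberately omits stemming, interning, and the skip-word list. The method name `simple_word_hash` and the lowercasing step are assumptions made for this example, not the library's behavior.

```ruby
# Simplified word-frequency count; not the library's word_hash.
def simple_word_hash(text)
  counts = Hash.new(0)                 # missing keys start at 0
  text.gsub(/[^\w\s]/, "").split.each do |word|
    counts[word.downcase] += 1         # lowercasing is an assumption of this sketch
  end
  counts
end

puts simple_word_hash("Dogs chase dogs!").inspect
```
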