vectorization of a text file
Posted by Fox on Stack Overflow
Published on 2012-03-21T17:25:28Z
java | vectorization
I am trying to implement vectorization of a text file. I have already created a dictionary (the unique words across all the documents). What is the best way to implement this in Java?
For example, suppose my dictionary contains the words {w1, w2, w3, w4}, and I have 2 documents, each containing a subset of the words in the dictionary. I need to write a matrix of the following form to a text file:
1,3,4,0
0,0,2,1
Here each row represents a document, and each value is the number of times the corresponding dictionary word occurs in that document.
Can you suggest the most efficient way to implement this in Java?
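To make the expected output concrete, here is a minimal sketch of one possible approach, not a definitive implementation. It assumes each document has already been split into a list of word tokens, that words outside the dictionary should be ignored, and that the matrix is small enough to write as dense rows. The class name `TermCountVectorizer` and its methods `buildIndex`, `vectorize`, `toCsvRow`, and `writeMatrix` are hypothetical names chosen for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns tokenized documents into rows of word counts.
final class TermCountVectorizer {

    /** Maps each dictionary word to a column index, in dictionary order. */
    static Map<String, Integer> buildIndex(List<String> dictionary) {
        Map<String, Integer> index = new HashMap<>();
        for (String word : dictionary) {
            // putIfAbsent keeps the columns contiguous even if a word is repeated.
            index.putIfAbsent(word, index.size());
        }
        return index;
    }

    /** Counts how often each dictionary word occurs among the given tokens. */
    static int[] vectorize(Map<String, Integer> index, List<String> tokens) {
        int[] counts = new int[index.size()];
        for (String token : tokens) {
            Integer column = index.get(token);
            if (column != null) { // assumption: words not in the dictionary are skipped
                counts[column]++;
            }
        }
        return counts;
    }

    /** Formats one row of counts as comma-separated values, e.g. "1,3,4,0". */
    static String toCsvRow(int[] counts) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < counts.length; i++) {
            if (i > 0) {
                row.append(',');
            }
            row.append(counts[i]);
        }
        return row.toString();
    }

    /** Writes one line per document to the output file. */
    static void writeMatrix(Path output, List<String> dictionary,
                            List<List<String>> documents) throws IOException {
        Map<String, Integer> index = buildIndex(dictionary);
        try (BufferedWriter writer =
                 Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (List<String> tokens : documents) {
                writer.write(toCsvRow(vectorize(index, tokens)));
                writer.newLine();
            }
        }
    }
}
```

With the dictionary {w1, w2, w3, w4}, a document containing w1 once, w2 three times, and w3 four times yields the row `1,3,4,0`. Because the word-to-column lookup goes through a `HashMap`, the work per document is proportional to its number of tokens rather than to the dictionary size times the document length. If the dictionary is large and most counts are zero, a sparse representation may suit better than dense rows.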
© Stack Overflow or respective owner