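The data folder contains four files: a.txt, b.txt, c.txt, and d.txt. Each file's content is its own name without the extension; for example, a.txt contains just a.
The idx folder is where the index is stored.
Indexing with Lucene: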
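[code="java"]import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class Indexer {
    public static void main(String[] args) throws Exception {
        Indexer indexer = new Indexer();
        indexer.index(new File("idx"), new File("data"));
    }

    public void index(File index, File data) throws Exception {
        // true = create a new index, replacing any existing one in idx
        IndexWriter indexWriter = new IndexWriter(index,
                new StandardAnalyzer(), true);
        indexWriter.setUseCompoundFile(false);
        indexDirectory(indexWriter, data);
        indexWriter.optimize();
        indexWriter.close();
    }

    // Recursively index every file under the given file or directory.
    public void indexDirectory(IndexWriter indexWriter, File data)
            throws IOException {
        if (data.isFile()) {
            indexFile(indexWriter, data);
        } else if (data.isDirectory()) {
            File[] files = data.listFiles();
            for (File file : files) {
                indexDirectory(indexWriter, file);
            }
        }
    }

    public void indexFile(IndexWriter indexWriter, File data)
            throws IOException {
        Document doc = new Document();
        // Reader-valued field: tokenized and indexed, but not stored
        doc.add(Field.Text("contents", new FileReader(data)));
        // Keyword field: stored and indexed as a single untokenized term
        doc.add(Field.Keyword("filename", data.getCanonicalPath()));
        indexWriter.addDocument(doc);
    }
}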
[/code]
Searching with Lucene:
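[code="java"]import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class Searcher {
    public static void main(String[] args) throws Exception {
        Searcher searcher = new Searcher();
        searcher.search(new File("idx"), "a");
    }

    public void search(File index, String str) throws Exception {
        // false = open the existing index instead of creating a new one
        Directory directory = FSDirectory.getDirectory(index, false);
        IndexSearcher indexSearcher = new IndexSearcher(directory);
        // Parse the query with the same analyzer that was used for indexing
        Query query = QueryParser
                .parse(str, "contents", new StandardAnalyzer());
        Hits hits = indexSearcher.search(query);
        System.out.println(hits.length());
        for (int i = 0; i < hits.length(); i++) {
            Document document = hits.doc(i);
            // "contents" was added from a Reader, so it is not stored: prints null
            System.out.println(document.getField("contents"));
            // "filename" is a stored Keyword field, so its value can be read back
            System.out.println(document.get("filename"));
        }
        indexSearcher.close();
    }
}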
[/code]
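The index and data paths are both fine (I have tested them), and indexing does produce index files. But when I run the Searcher, hits.length() is always 0, and I can't figure out why.
(I'm using Lucene 1.4.)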
[b]Follow-up:[/b]
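Which analyzer is most commonly used these days? I understand that "a" is a stop word, but if I write "cat" instead, shouldn't it be recognized as two words, "c" and "t"? That is not what actually happens. Lucene 1.4 may be a bit old, but it is still a widely used version.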
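To make the stop-word point concrete, here is a minimal plain-Java sketch of what I understand an analyzer with stop-word removal to be doing. It does not use Lucene at all: StopWordSketch, analyze, countHits, and the short stop-word list are all names and values invented for illustration, and the real StandardAnalyzer has more elaborate tokenization rules. The idea is that text is split into whole words at non-alphanumeric boundaries, and stop words are then dropped, both at index time and at query time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class StopWordSketch {
    // Illustrative subset of an English stop-word list (an assumption, not Lucene's exact list).
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "of", "the", "to"));

    // Lowercase the text, split it on anything that is not a letter or digit,
    // then drop empty pieces and stop words. Words are never split into letters.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Count the documents that contain every term left in the query after analysis.
    static int countHits(Map<String, String> docs, String queryText) {
        List<String> queryTerms = analyze(queryText);
        if (queryTerms.isEmpty()) {
            return 0; // the whole query was removed as stop words
        }
        int hits = 0;
        for (String content : docs.values()) {
            if (analyze(content).containsAll(queryTerms)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Same layout as the data folder: each file contains its own name.
        Map<String, String> docs = new LinkedHashMap<String, String>();
        for (String name : new String[] {"a", "b", "c", "d"}) {
            docs.put(name + ".txt", name);
        }
        System.out.println(analyze("a"));         // []
        System.out.println(analyze("cat"));       // [cat]
        System.out.println(countHits(docs, "a")); // 0
        System.out.println(countHits(docs, "b")); // 1
    }
}
```

In this simplified model, "a" disappears on both the indexing side and the query side, so nothing can match it, while "cat" stays a single token rather than becoming "c" and "t". If Lucene behaves the same way, searching for "b" instead of "a" should be a quick way to check whether the index itself is fine.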