Integrating Java and PHP in WebSphere sMash (6)

9. Next, add the following PHP code to index.php:

<?php

// The Lucene index lives in an "index" directory beside the script's own directory.
$directory = dirname(__FILE__)."/../index";
if (file_exists($directory) === FALSE) {
    mkdir($directory);
}

define("INDEX_DIRECTORY", $directory);

try {
    // Read the submitted form fields from the sMash global context.
    $extension = zget('/request/params/extension');
    if (strlen($extension) > 0) {
        $directory = zget('/request/params/directory');
        if (strlen($directory) > 0) {
            index_directory($directory, $extension);
        }
    }
} catch (JavaException $exception) {
    echo "Index creation failed [".
        $exception->getMessage()."]<br/>";
}
?>

 

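10. Don't run it yet, because it isn't finished! The code reads the form variables from the global context and checks whether they have been filled in. If they have, it calls the index_directory function. This function, explained below, adds every matching file to the Lucene search engine.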
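
The check in step 9 can be isolated as a small sketch. The sMash zget call is not available outside the sMash runtime, so this version assumes a plain array in place of the global context; should_index is a hypothetical helper name and is not part of the article's code.

```php
<?php
// Hypothetical helper: mirrors the nested strlen() checks from step 9.
// $params stands in for the values under /request/params (an assumption;
// in sMash they would come from zget()).
function should_index(array $params) {
    $extension = isset($params['extension']) ? (string) $params['extension'] : '';
    $directory = isset($params['directory']) ? (string) $params['directory'] : '';
    // Index only when both form fields were filled in.
    return strlen($extension) > 0 && strlen($directory) > 0;
}

var_dump(should_index(array('extension' => 'txt', 'directory' => '/tmp/docs'))); // bool(true)
var_dump(should_index(array('extension' => 'txt')));                             // bool(false)
```

Requiring both fields means a half-filled form simply re-displays the page without touching the index.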

11. Next, add the following PHP code to index.php:

/**
 * This creates an index from scratch and adds all the documents
 * by recursing from the directory passed in. It also checks
 * each candidate file to see if it matches the file extension.
 */
function index_directory($path, $extension) {
    echo "Indexing! [".$path.",".$extension."]<br/>";

    // Uses the SimpleAnalyzer because we will do a performance comparison
    // with the PHP implementation of Lucene in the Zend Framework and it
    // is the closest match
    $analyser = new Java("org.apache.lucene.analysis.SimpleAnalyzer");
    $policy = new Java("org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy");

    $file = new Java("java.io.File", INDEX_DIRECTORY, FALSE);
    $file_directory = new JavaClass("org.apache.lucene.store.FSDirectory");
    $directory = $file_directory->getDirectory($file);

    $writer = new Java("org.apache.lucene.index.IndexWriter",
        $directory, TRUE, $analyser, TRUE, $policy);

    $writer->setUseCompoundFile(FALSE);

    // Lucene only indexes the first 10,000 terms of a field by default,
    // so raise the limit before any documents are added
    $writer->setMaxFieldLength(1000000);

    // Insert some calls to microtime() for comparison
    $start_time = get_microtime();
    recursive_index_directory($writer, $path, $extension);
    $count = $writer->docCount();
    $end_index_time = get_microtime();

    $writer->optimize();
    $end_time = get_microtime();
    $writer->close();

    echo "Finished indexing [".$count." documents]<br/>";
    $t1 = $end_index_time - $start_time;
    $t2 = $end_time - $end_index_time;
    echo "Time to index  = $t1 <br/>";
    echo "Time to optimize  = $t2 <br/>";
}
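
index_directory calls get_microtime(), which is not defined in the code shown so far. A minimal sketch, assuming the helper only needs to return the current time in seconds as a float, so that subtracting two readings gives elapsed seconds:

```php
<?php
// Assumed implementation of the get_microtime() helper used by
// index_directory(); the article's own version may differ.
function get_microtime() {
    // microtime(true) returns the Unix timestamp as a float,
    // including the microsecond fraction.
    return microtime(true);
}

$start = get_microtime();
usleep(1000); // stand-in for the indexing work
$elapsed = get_microtime() - $start;
echo "Elapsed: ".$elapsed." seconds\n";
```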

 

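This article does not go into the details of the Java Lucene API. In short, this code creates an IndexWriter object. This is the key indexing object: files are added to it as the script recurses through the directories. Note that the inverted index can be kept on many different kinds of storage (for example, a RAM disk). In this example the index lives in a directory on the regular file system, so the FSDirectory class is used.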

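Once the IndexWriter is set up, the script calls recursive_index_directory to do the actual indexing work. This function is passed the IndexWriter, the starting directory, and the file extension that candidate files must match.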
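
The traversal that recursive_index_directory performs can be sketched without Lucene. This is not the article's implementation: find_matching_files is a hypothetical name, and it returns the matching paths instead of adding documents to an IndexWriter, so it runs in plain PHP outside sMash.

```php
<?php
// Hypothetical stand-in for the traversal part of recursive_index_directory():
// walks $path recursively and returns the files whose extension equals $extension.
function find_matching_files($path, $extension) {
    $matches = array();
    $entries = is_dir($path) ? scandir($path) : false;
    if ($entries === false) {
        return $matches; // missing or unreadable directory: nothing to index
    }
    foreach ($entries as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue; // skip the self and parent links
        }
        $full = $path.DIRECTORY_SEPARATOR.$entry;
        if (is_dir($full)) {
            // Recurse into subdirectories and merge their results.
            $matches = array_merge($matches, find_matching_files($full, $extension));
        } elseif (pathinfo($full, PATHINFO_EXTENSION) === $extension) {
            // In the real function this is where the file would be
            // handed to the IndexWriter.
            $matches[] = $full;
        }
    }
    return $matches;
}

// List every .php file below the directory that contains this script.
foreach (find_matching_files(dirname(__FILE__), 'php') as $file) {
    echo $file."\n";
}
```

Collecting paths first keeps the file-system walk separate from the indexing call, which makes the matching rule easy to check on its own.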
