Coreseek: searching for Chinese returns no results, but searching for English words works fine

[root@abc testpack]# /usr/local/coreseek/bin/indexer -c etc/sphinx.conf --all
Coreseek Fulltext 4.1 [ Sphinx 2.0.2-dev (r2922)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file 'etc/sphinx.conf'...
indexing index 'test1'...
WARNING: Attribute count is 0: switching to none docinfo
collected 5 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 5 docs, 186 bytes
total 0.064 sec, 2870 bytes/sec, 77.16 docs/sec
total 2 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 6 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
Searching for Chinese returns no results:
[root@abc testpack]# /usr/local/coreseek/bin/search -c etc/sphinx.conf '水火不容'
Coreseek Fulltext 4.1 [ Sphinx 2.0.2-dev (r2922)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file 'etc/sphinx.conf'...
index 'test1': query '水火不容 ': returned 0 matches of 0 total in 0.000 sec

words:
1. '水火': 0 documents, 0 hits
2. '不容': 0 documents, 0 hits

Searching for English does return results:
[root@abc testpack]# /usr/local/coreseek/bin/search -c etc/sphinx.conf 'apple'
Coreseek Fulltext 4.1 [ Sphinx 2.0.2-dev (r2922)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file 'etc/sphinx.conf'...
index 'test1': query 'apple ': returned 1 matches of 1 total in 0.001 sec

displaying matches:
1. document=5, weight=2780
id=5
title=apple
content=apple,banana

words:
1. 'apple': 1 documents, 2 hits

This is the database table:
mysql> select * from tt;
+----+--------------+-----------------+
| id | title        | content         |
+----+--------------+-----------------+
|  1 | 西水         | 水水            |
|  2 | 水火不容     | 水火不容        |
|  3 | 水啊啊       | 啊水货          |
|  4 | 东南西水     | 啊西西哈哈      |
|  5 | apple        | apple,banana    |
+----+--------------+-----------------+
5 rows in set (0.00 sec)
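
One check that might help narrow this down is comparing the byte total the indexer reported (186 bytes for 5 docs) with what these five rows should weigh. The sketch below is not a confirmed diagnosis. It assumes the indexer's byte total is simply the combined size of the title and content columns, and that MySQL's latin1 behaves like cp1252 with the five undefined bytes passed through unchanged. The helper names are made up for illustration.

```python
# Rows copied from the `tt` table shown above: (title, content).
rows = [
    ("西水", "水水"),
    ("水火不容", "水火不容"),
    ("水啊啊", "啊水货"),
    ("东南西水", "啊西西哈哈"),
    ("apple", "apple,banana"),
]


def decode_like_mysql_latin1(raw: bytes) -> str:
    """Decode bytes as cp1252; bytes cp1252 leaves undefined keep their value.

    Assumption: this mirrors how MySQL's latin1 charset treats each byte.
    """
    chars = []
    for byte in raw:
        try:
            chars.append(bytes([byte]).decode("cp1252"))
        except UnicodeDecodeError:
            chars.append(chr(byte))  # 0x81, 0x8D, 0x8F, 0x90, 0x9D
    return "".join(chars)


def double_encode(text: str) -> bytes:
    """UTF-8 bytes misread as latin1 and then converted to UTF-8 again."""
    return decode_like_mysql_latin1(text.encode("utf-8")).encode("utf-8")


# Size of all indexed columns if the data reaches the indexer as clean UTF-8.
utf8_bytes = sum(len(col.encode("utf-8")) for row in rows for col in row)

# Size if every column was converted latin1 -> utf8 one extra time.
double_encoded_bytes = sum(len(double_encode(col)) for row in rows for col in row)

print("clean UTF-8 total:   ", utf8_bytes)
print("double-encoded total:", double_encoded_bytes)
```

Under those assumptions the clean total is 98 bytes and the double-encoded total is 186 bytes, which is the figure in the indexer output. That would be consistent with the columns being declared as latin1 while actually holding UTF-8 bytes, so that SET NAMES utf8 converts them a second time and the indexed words no longer match what mmseg produces for the query. Checking SHOW CREATE TABLE tt and SELECT HEX(title) FROM tt should confirm or rule this out.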

Below is the configuration file:
#
# Sphinx configuration file sample
#
# WARNING! While this sample file mentions all available options,
# it contains (very) short helper descriptions only. Please refer to
# doc/sphinx.html for details.
#

#############################################################################
## data source definition
#############################################################################

source src1
{
# data source type. mandatory, no default value
# known types are mysql, pgsql, mssql, xmlpipe, xmlpipe2, odbc
type = mysql

#####################################################################
## SQL settings (for 'mysql' and 'pgsql' types)
#####################################################################

# some straightforward parameters for SQL source types
sql_host        = localhost
sql_user        = root
sql_pass        = 123456
sql_db          = haha
sql_port        = 3306  # optional, default is 3306

# UNIX socket name
# optional, default is empty (reuse client library defaults)
# usually '/var/lib/mysql/mysql.sock' on Linux
# usually '/tmp/mysql.sock' on FreeBSD
#
 sql_sock       = /var/lib/mysql/mysql.sock


# MySQL specific client connection flags
# optional, default is 0
#
# mysql_connect_flags   = 32 # enable compression

# MySQL specific SSL certificate settings
# optional, defaults are empty
#
# mysql_ssl_cert        = /etc/ssl/client-cert.pem
# mysql_ssl_key     = /etc/ssl/client-key.pem
# mysql_ssl_ca      = /etc/ssl/cacert.pem

# MS SQL specific Windows authentication mode flag
# MUST be in sync with charset_type index-level setting
# optional, default is 0
#
# mssql_winauth     = 1 # use currently logged on user credentials


# MS SQL specific Unicode indexing flag
# optional, default is 0 (request SBCS data)
#
# mssql_unicode     = 1 # request Unicode data from server


# ODBC specific DSN (data source name)
# mandatory for odbc source type, no default value
#
# odbc_dsn      = DBQ=C:\data;DefaultDir=C:\data;Driver={Microsoft Text Driver (*.txt; *.csv)};
# sql_query     = SELECT id, data FROM documents.csv


# ODBC and MS SQL specific, per-column buffer sizes
# optional, default is auto-detect
#
# sql_column_buffers    = content=12M, comments=1M


# pre-query, executed before the main fetch query
# multi-value, optional, default is empty list of queries
#
 sql_query_pre      = SET NAMES utf8
 sql_query_pre      = SET SESSION query_cache_type=OFF


# main document fetch query
# mandatory, integer document ID field MUST be the first selected column
sql_query       = \
    SELECT id, title, content FROM tt


# joined/payload field fetch query
# joined fields let you avoid (slow) JOIN and GROUP_CONCAT
# payload fields let you attach custom per-keyword values (eg. for ranking)
#
# syntax is FIELD-NAME 'from'  ( 'query' | 'payload-query' ); QUERY
# joined field QUERY should return 2 columns (docid, text)
# payload field QUERY should return 3 columns (docid, keyword, weight)
#
# REQUIRES that query results are in ascending document ID order!
# multi-value, optional, default is empty list of queries
#
# sql_joined_field  = tags from query; SELECT docid, CONCAT('tag',tagid) FROM tags ORDER BY docid ASC
# sql_joined_field  = wtags from payload-query; SELECT docid, tag, tagweight FROM tags ORDER BY docid ASC


# file based field declaration
#
# content of this field is treated as a file name
# and the file gets loaded and indexed in place of a field
#
# max file size is limited by max_file_field_buffer indexer setting
# file IO errors are non-fatal and get reported as warnings
#
# sql_file_field        = content_file_path
    # sql_query_info        = SELECT * FROM tt  WHERE id=$id

# range query setup, query that must return min and max ID values
# optional, default is empty
#
# sql_query will need to reference $start and $end boundaries
# if using ranged query:
#
# sql_query     = \
#   SELECT doc.id, doc.id AS group, doc.title, doc.data \
#   FROM documents doc \
#   WHERE id>=$start AND id<=$end
#
# sql_query_range       = SELECT MIN(id),MAX(id) FROM documents


# range query step
# optional, default is 1024
#
# sql_range_step        = 1000


# unsigned integer attribute declaration
# multi-value (an arbitrary number of attributes is allowed), optional
# optional bit size can be specified, default is 32
#
# sql_attr_uint     = author_id
# sql_attr_uint     = forum_id:9 # 9 bits for forum_id
#sql_attr_uint      = group_id

# boolean attribute declaration
# multi-value (an arbitrary number of attributes is allowed), optional
# equivalent to sql_attr_uint with 1-bit size
#
# sql_attr_bool     = is_deleted


# bigint attribute declaration
# multi-value (an arbitrary number of attributes is allowed), optional
# declares a signed (unlike uint!) 64-bit attribute
#
# sql_attr_bigint       = my_bigint_id


# UNIX timestamp attribute declaration
# multi-value (an arbitrary number of attributes is allowed), optional
# similar to integer, but can also be used in date functions
#
# sql_attr_timestamp    = posted_ts
# sql_attr_timestamp    = last_edited_ts
#sql_attr_timestamp = date_added

# string ordinal attribute declaration
# multi-value (an arbitrary number of attributes is allowed), optional
# sorts strings (bytewise), and stores their indexes in the sorted list
# sorting by this attr is equivalent to sorting by the original strings
#
# sql_attr_str2ordinal  = author_name


# floating point attribute declaration
# multi-value (an arbitrary number of attributes is allowed), optional
# values are stored in single precision, 32-bit IEEE 754 format
#
# sql_attr_float        = lat_radians
# sql_attr_float        = long_radians


# multi-valued attribute (MVA) attribute declaration
# multi-value (an arbitrary number of attributes is allowed), optional
# MVA values are variable length lists of unsigned 32-bit integers
#
# syntax is ATTR-TYPE ATTR-NAME 'from' SOURCE-TYPE [;QUERY] [;RANGE-QUERY]
# ATTR-TYPE is 'uint' or 'timestamp'
# SOURCE-TYPE is 'field', 'query', or 'ranged-query'
# QUERY is SQL query used to fetch all ( docid, attrvalue ) pairs
# RANGE-QUERY is SQL query used to fetch min and max ID values, similar to 'sql_query_range'
#
# sql_attr_multi        = uint tag from query; SELECT docid, tagid FROM tags
# sql_attr_multi        = uint tag from ranged-query; \
#   SELECT docid, tagid FROM tags WHERE id>=$start AND id<=$end; \
#   SELECT MIN(docid), MAX(docid) FROM tags


# string attribute declaration
# multi-value (an arbitrary number of these is allowed), optional
# lets you store and retrieve strings
#
# sql_attr_string       = stitle


# wordcount attribute declaration
# multi-value (an arbitrary number of these is allowed), optional
# lets you count the words at indexing time
#
# sql_attr_str2wordcount    = stitle


# combined field plus attribute declaration (from a single column)
# stores column as an attribute, but also indexes it as a full-text field
#
# sql_field_string  = author
# sql_field_str2wordcount   = title


# post-query, executed on sql_query completion
# optional, default is empty
#
# sql_query_post        =


# post-index-query, executed on successful indexing completion
# optional, default is empty
# $maxid expands to max document ID actually fetched from DB
#
# sql_query_post_index  = REPLACE INTO counters ( id, val ) \
#   VALUES ( 'max_indexed_id', $maxid )


# ranged query throttling, in milliseconds
# optional, default is 0 which means no delay
# enforces given delay before each query step
sql_ranged_throttle = 0

# document info query, ONLY for CLI search (ie. testing and debugging)
# optional, default is empty
# must contain $id macro and must fetch the document by that id
sql_query_info      = SELECT * FROM tt WHERE id=$id

# kill-list query, fetches the document IDs for kill-list
# k-list will suppress matches from preceding indexes in the same query
# optional, default is empty
#
# sql_query_killlist    = SELECT id FROM documents WHERE edited>=@last_reindex


# columns to unpack on indexer side when indexing
# multi-value, optional, default is empty list
#
# unpack_zlib       = zlib_column
# unpack_mysqlcompress  = compressed_column
# unpack_mysqlcompress  = compressed_column_2


# maximum unpacked length allowed in MySQL COMPRESS() unpacker
# optional, default is 16M
#
# unpack_mysqlcompress_maxsize  = 16M


#####################################################################
## xmlpipe2 settings
#####################################################################

# type          = xmlpipe

# shell command to invoke xmlpipe stream producer
# mandatory
#
# xmlpipe_command       = cat /usr/local/coreseek/var/test.xml

# xmlpipe2 field declaration
# multi-value, optional, default is empty
#
# xmlpipe_field     = subject
# xmlpipe_field     = content


# xmlpipe2 attribute declaration
# multi-value, optional, default is empty
# all xmlpipe_attr_XXX options are fully similar to sql_attr_XXX
#
# xmlpipe_attr_timestamp    = published
# xmlpipe_attr_uint = author_id


# perform UTF-8 validation, and filter out incorrect codes
# avoids XML parser choking on non-UTF-8 documents
# optional, default is 0
#
# xmlpipe_fixup_utf8    = 1

}

# inherited source example
#
# all the parameters are copied from the parent source,
# and may then be overridden in this source definition

source src1throttled : src1
{
sql_ranged_throttle = 100
}

#############################################################################
## index definition
#############################################################################

# local index example
#
# this is an index which is stored locally in the filesystem
#
# all indexing-time options (such as morphology and charsets)
# are configured per local index

index test1
{
# index type
# optional, default is 'plain'
# known values are 'plain', 'distributed', and 'rt' (see samples below)
# type = plain

# document source(s) to index
# multi-value, mandatory
# document IDs must be globally unique across all sources
source          = src1

# index files path and file name, without extension
# mandatory, path must be writable, extensions will be auto-appended
#path           = /usr/local/coreseek/var/data/test1

# document attribute values (docinfo) storage mode
# optional, default is 'extern'
# known values are 'none', 'extern' and 'inline'
docinfo         = extern

# memory locking for cached data (.spa and .spi), to prevent swapping
# optional, default is 0 (do not mlock)
# requires searchd to be run from root
mlock           = 0

# a list of morphology preprocessors to apply
# optional, default is empty
#
# builtin preprocessors are 'none', 'stem_en', 'stem_ru', 'stem_enru',
# 'soundex', and 'metaphone'; additional preprocessors available from
# libstemmer are 'libstemmer_XXX', where XXX is algorithm code
# (see libstemmer_c/libstemmer/modules.txt)
#
# morphology        = stem_en, stem_ru, soundex
# morphology        = libstemmer_german
# morphology        = libstemmer_sv
morphology      = none

# minimum word length at which to enable stemming
# optional, default is 1 (stem everything)
#
# min_stemming_len  = 1

path = /root/rearch_dir
# stopword files list (space separated)
# optional, default is empty
# contents are plain text, charset_table and stemming are both applied
#
# stopwords = /usr/local/coreseek/var/data/stopwords.txt

# wordforms file, in "mapfrom > mapto" plain text format
# optional, default is empty
#
# wordforms     = /usr/local/coreseek/var/data/wordforms.txt


# tokenizing exceptions file
# optional, default is empty
#
# plain text, case sensitive, space insensitive in map-from part
# one "Map Several Words => ToASingleOne" entry per line
#
# exceptions        = /usr/local/coreseek/var/data/exceptions.txt


# minimum indexed word length
# default is 1 (index everything)
min_word_len        = 1

# charset encoding type
# optional, default is 'sbcs'
# known types are 'sbcs' (Single Byte CharSet) and 'utf-8'
charset_type        = zh_cn.utf-8
    charset_dictpath        = /usr/local/mmseg3/etc/
# charset definition and case folding rules "table"
# optional, default value depends on charset_type
#
# defaults are configured to include English and Russian characters only
# you need to change the table to include additional ones
# this behavior MAY change in future versions
#
# 'sbcs' default value is
# charset_table     = 0..9, A..Z->a..z, _, a..z, U+A8->U+B8, U+B8, U+C0..U+DF->U+E0..U+FF, U+E0..U+FF
#
# 'utf-8' default value is
#charset_table      = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F


# ignored characters list
# optional, default value is empty
#
# ignore_chars      = U+00AD


# minimum word prefix length to index
# optional, default is 0 (do not index prefixes)
#
# min_prefix_len        = 0


# minimum word infix length to index
# optional, default is 0 (do not index infixes)
#
# min_infix_len     = 0


# list of fields to limit prefix/infix indexing to
# optional, default value is empty (index all fields in prefix/infix mode)
#
# prefix_fields     = filename
# infix_fields      = url, domain


# enable star-syntax (wildcards) when searching prefix/infix indexes
# search-time only, does not affect indexing, can be 0 or 1
# optional, default is 0 (do not use wildcard syntax)
#
# enable_star       = 1


# expand keywords with exact forms and/or stars when searching fit indexes
# search-time only, does not affect indexing, can be 0 or 1
# optional, default is 0 (do not expand keywords)
#
# expand_keywords       = 1


# n-gram length to index, for CJK indexing
# only supports 0 and 1 for now, other lengths to be implemented
# optional, default is 0 (disable n-grams)
#
 ngram_len      = 0


# n-gram characters list, for CJK indexing
# optional, default is empty
#
# ngram_chars       = U+3000..U+2FA1F


# phrase boundary characters list
# optional, default is empty
#
# phrase_boundary       = ., ?, !, U+2026 # horizontal ellipsis


# phrase boundary word position increment
# optional, default is 0
#
# phrase_boundary_step  = 100


# blended characters list
# blended chars are indexed both as separators and valid characters
# for instance, AT&T will result in 3 tokens ("at", "t", and "at&t")
# optional, default is empty
#
# blend_chars       = +, &, U+23


# blended token indexing mode
# a comma separated list of blended token indexing variants
# known variants are trim_none, trim_head, trim_tail, trim_both, skip_pure
# optional, default is trim_none
#
# blend_mode        = trim_tail, skip_pure


# whether to strip HTML tags from incoming documents
# known values are 0 (do not strip) and 1 (do strip)
# optional, default is 0
html_strip      = 0

# what HTML attributes to index if stripping HTML
# optional, default is empty (do not index anything)
#
# html_index_attrs  = img=alt,title; a=title;


# what HTML elements contents to strip
# optional, default is empty (do not strip element contents)
#
# html_remove_elements  = style, script


# whether to preopen index data files on startup
# optional, default is 0 (do not preopen), searchd-only
#
# preopen           = 1


# whether to keep dictionary (.spi) on disk, or cache it in RAM
# optional, default is 0 (cache in RAM), searchd-only
#
# ondisk_dict       = 1


# whether to enable in-place inversion (2x less disk, 90-95% speed)
# optional, default is 0 (use separate temporary files), indexer-only
#
# inplace_enable        = 1


# in-place fine-tuning options
# optional, defaults are listed below
#
# inplace_hit_gap       = 0 # preallocated hitlist gap size
# inplace_docinfo_gap   = 0 # preallocated docinfo gap size
# inplace_reloc_factor  = 0.1 # relocation buffer size within arena
# inplace_write_factor  = 0.1 # write buffer size within arena


# whether to index original keywords along with stemmed versions
# enables "=exactform" operator to work
# optional, default is 0
#
# index_exact_words = 1


# position increment on overshort (less than min_word_len) words
# optional, allowed values are 0 and 1, default is 1
#
# overshort_step        = 1


# position increment on stopword
# optional, allowed values are 0 and 1, default is 1
#
# stopword_step     = 1


# hitless words list
# positions for these keywords will not be stored in the index
# optional, allowed values are 'all', or a list file name
#
# hitless_words     = all
# hitless_words     = hitless.txt


# detect and index sentence and paragraph boundaries
# required for the SENTENCE and PARAGRAPH operators to work
# optional, allowed values are 0 and 1, default is 0
#
# index_sp          = 1


# index zones, delimited by HTML/XML tags
# a comma separated list of tags and wildcards
# required for the ZONE operator to work
# optional, default is empty string (do not index zones)
#
# index_zones       = title, h*, th

}

# inherited index example
#
# all the parameters are copied from the parent index,
# and may then be overridden in this index definition
#index test1stemmed : test1
#{
# path = /usr/local/coreseek/var/data/test1stemmed
# morphology = stem_en
#}

# distributed index example
#
# this is a virtual index which can NOT be directly indexed,
# and only contains references to other local and/or remote indexes

#index dist1
#{
# 'distributed' index type MUST be specified

# type = distributed

# local index to be searched
# there can be many local indexes configured

# local = test1
# local = test1stemmed

# remote agent
# multiple remote agents may be specified
# syntax for TCP connections is 'hostname:port:index1,[index2[,...]]'
# syntax for local UNIX connections is '/path/to/socket:index1,[index2[,...]]'

# agent = localhost:9313:remote1
# agent = localhost:9314:remote2,remote3

# agent         = /var/run/searchd.sock:remote4

# blackhole remote agent, for debugging/testing
# network errors and search results will be ignored
#
# agent_blackhole       = testbox:9312:testindex1,testindex2


# remote agent connection timeout, milliseconds
# optional, default is 1000 ms, ie. 1 sec

# agent_connect_timeout = 1000

# remote agent query timeout, milliseconds
# optional, default is 3000 ms, ie. 3 sec

# agent_query_timeout = 3000

#}

# realtime index example
#
# you can run INSERT, REPLACE, and DELETE on this index on the fly
# using MySQL protocol (see 'listen' directive below)

#index rt
#{
# 'rt' index type must be specified to use RT index
#type = rt

# index files path and file name, without extension
# mandatory, path must be writable, extensions will be auto-appended

# path = /usr/local/coreseek/var/data/rt

# RAM chunk size limit
# RT index will keep at most this much data in RAM, then flush to disk
# optional, default is 32M
#
# rt_mem_limit      = 512M

# full-text field declaration
# multi-value, mandatory

# rt_field = title
# rt_field = content

# unsigned integer attribute declaration
# multi-value (an arbitrary number of attributes is allowed), optional
# declares an unsigned 32-bit attribute

# rt_attr_uint = gid

# RT indexes currently support the following attribute types:
# uint, bigint, float, timestamp, string
#
# rt_attr_bigint        = guid
# rt_attr_float     = gpa
# rt_attr_timestamp = ts_added
# rt_attr_string        = content

#}

#############################################################################
## indexer settings
#############################################################################

indexer
{
# memory limit, in bytes, kilobytes (16384K) or megabytes (256M)
# optional, default is 32M, max is 2047M, recommended is 256M to 1024M
mem_limit = 256M

# maximum IO calls per second (for I/O throttling)
# optional, default is 0 (unlimited)
#
# max_iops      = 40


# maximum IO call size, bytes (for I/O throttling)
# optional, default is 0 (unlimited)
#
# max_iosize        = 1048576


# maximum xmlpipe2 field length, bytes
# optional, default is 2M
#
# max_xmlpipe2_field    = 4M


# write buffer size, bytes
# several (currently up to 4) buffers will be allocated
# write buffers are allocated in addition to mem_limit
# optional, default is 1M
#
# write_buffer      = 1M


# maximum file field adaptive buffer size
# optional, default is 8M, minimum is 1M
#
# max_file_field_buffer = 32M

}

#############################################################################
## searchd settings
#############################################################################

searchd
{
# [hostname:]port[:protocol], or /unix/socket/path to listen on
# known protocols are 'sphinx' (SphinxAPI) and 'mysql41' (SphinxQL)
#
# multi-value, multiple listen points are allowed
# optional, defaults are 9312:sphinx and 9306:mysql41, as below
#
# listen = 127.0.0.1
# listen = 192.168.0.1:9312
# listen = 9312
# listen = /var/run/searchd.sock
listen = 9312
#listen = 9306:mysql41

# log file, searchd run info is logged here
# optional, default is 'searchd.log'
log         = /usr/local/coreseek/var/log/searchd.log

# query log file, all search queries are logged here
# optional, default is empty (do not log queries)
query_log       = /usr/local/coreseek/var/log/query.log

# client read timeout, seconds
# optional, default is 5
read_timeout        = 5

# request timeout, seconds
# optional, default is 5 minutes
client_timeout      = 300

# maximum amount of children to fork (concurrent searches to run)
# optional, default is 0 (unlimited)
max_children        = 30

# PID file, searchd process ID file name
# mandatory
pid_file        = /usr/local/coreseek/var/log/searchd.pid

# max amount of matches the daemon ever keeps in RAM, per-index
# WARNING, THERE'S ALSO PER-QUERY LIMIT, SEE SetLimits() API CALL
# default is 1000 (just like Google)
max_matches     = 1000

# seamless rotate, prevents rotate stalls if precaching huge datasets
# optional, default is 1
seamless_rotate     = 1

# whether to forcibly preopen all indexes on startup
# optional, default is 1 (preopen everything)
preopen_indexes     = 0

# whether to unlink .old index copies on successful rotation.
# optional, default is 1 (do unlink)
unlink_old      = 1

# attribute updates periodic flush timeout, seconds
# updates will be automatically dumped to disk this frequently
# optional, default is 0 (disable periodic flush)
#
# attr_flush_period = 900


# instance-wide ondisk_dict defaults (per-index value take precedence)
# optional, default is 0 (precache all dictionaries in RAM)
#
# ondisk_dict_default   = 1


# MVA updates pool size
# shared between all instances of searchd, disables attr flushes!
# optional, default size is 1M
mva_updates_pool    = 1M

# max allowed network packet size
# limits both query packets from clients, and responses from agents
# optional, default size is 8M
max_packet_size     = 8M

# crash log path
# searchd will (try to) log crashed query to 'crash_log_path.PID' file
# optional, default is empty (do not create crash logs)
#
# crash_log_path        = /usr/local/coreseek/var/log/crash


# max allowed per-query filter count
# optional, default is 256
max_filters     = 256

# max allowed per-filter values count
# optional, default is 4096
max_filter_values   = 4096


# socket listen queue length
# optional, default is 5
#
# listen_backlog        = 5


# per-keyword read buffer size
# optional, default is 256K
#
# read_buffer       = 256K


# unhinted read size (currently used when reading hits)
# optional, default is 32K
#
# read_unhinted     = 32K


# max allowed per-batch query count (aka multi-query count)
# optional, default is 32
max_batch_queries   = 32


# max common subtree document cache size, per-query
# optional, default is 0 (disable subtree optimization)
#
# subtree_docs_cache    = 4M


# max common subtree hit cache size, per-query
# optional, default is 0 (disable subtree optimization)
#
# subtree_hits_cache    = 8M


# multi-processing mode (MPM)
# known values are none, fork, prefork, and threads
# optional, default is fork
#
workers         = threads # for RT to work


# max threads to create for searching local parts of a distributed index
# optional, default is 0, which means disable multi-threaded searching
# should work with all MPMs (ie. does NOT require workers=threads)
#
# dist_threads      = 4


# binlog files path; use empty string to disable binlog
# optional, default is build-time configured data directory
#
# binlog_path       = # disable logging
# binlog_path       = /usr/local/coreseek/var/data # binlog.001 etc will be created there


# binlog flush/sync mode
# 0 means flush and sync every second
# 1 means flush and sync every transaction
# 2 means flush every transaction, sync every second
# optional, default is 2
#
# binlog_flush      = 2


# binlog per-file size limit
# optional, default is 128M, 0 means no limit
#
# binlog_max_log_size   = 256M


# per-thread stack size, only affects workers=threads mode
# optional, default is 64K
#
# thread_stack          = 128K


# per-keyword expansion limit (for dict=keywords prefix searches)
# optional, default is 0 (no limit)
#
# expansion_limit       = 1000


# RT RAM chunks flush period
# optional, default is 0 (no periodic flush)
#
# rt_flush_period       = 900


# query log file format
# optional, known values are plain and sphinxql, default is plain
#
# query_log_format      = sphinxql


# version string returned to MySQL network protocol clients
# optional, default is empty (use Sphinx version)
#
# mysql_version_string  = 5.0.37


# trusted plugin directory
# optional, default is empty (disable UDFs)
#
# plugin_dir            = /usr/local/sphinx/lib


# default server-wide collation
# optional, default is libc_ci
#
# collation_server      = utf8_general_ci


# server-wide locale for libc based collations
# optional, default is C
#
# collation_libc_locale = ru_RU.UTF-8


# threaded server watchdog (only used in workers=threads mode)
# optional, values are 0 and 1, default is 1 (watchdog on)
#
# watchdog              = 1


# SphinxQL compatibility mode (legacy columns and their names)
# optional, default is 0 (SQL compliant syntax and result sets)
#
# compat_sphinxql_magics    = 1

}

--eof--

Please help. I can't figure out what is wrong: Chinese searches return no results.

史上最牛逼的 VSCode 插件,提高开发效率!
这篇文章收集了一些常用的vscode插件,提高开发效率。
Java第二周学习
Java第二周学习 1. 数组 1.1 定义数组格式 数据类型[] 数组名 = new 数据类型[容量]; int[] arr = new int[10]; 赋值左侧 数据类型: 告知编译器,当前数组中能够保存的数据类型到底是什么?并且在确定数据类型之后,整个数组中保存的数据类型无法修改!!! []: 告知编译器这里定义的是一个数组类型数据。 明确告知编译器,数组名是一个【引用数据类型...
有没有简单一点的 Python 小例子或小项目?
分享一波Github上适合新手入门、又十分有趣的Python项目~ 1. 人脸识别 star:30.5k 最简洁的人脸识别库。可以使用Python和命令行工具提取、识别、操作人脸。其人脸识别是基于业内领先的C++开源库dlib中的深度学习模型,用Labeled Faces in the Wild人脸数据集进行测试,准确率高达99.38%。 而且有中文版README哟~ 2. faceai sta...
看完这篇JVM,阿里面试官都不怕!
前言 只有光头才能变强 本已收录至我的GitHub精选文章,欢迎Star:https://github.com/ZhongFuCheng3y/3y 学习JVM的目的也很简单: 能够知道JVM是什么,为我们干了什么,具体是怎么干的。能够理解到一些初学时不懂的东西 在面试的时候有谈资 能装逼 (图片来源:https://zhuanlan.zhihu.com/p/25511795,侵删) 声...
隆重向你推荐这 8 个开源 Java 类库
昨天在青铜时代群里看到读者朋友们在讨论 Java 最常用的工具类,我觉得大家推荐的确实都挺常见的,我自己用的频率也蛮高的。恰好我在 programcreek 上看到过一篇类似的文章,就想着梳理一下分享给大家。 在 Java 中,工具类通常用来定义一组执行通用操作的方法。本篇文章将会向大家展示 8 个工具类以及它们最常用的方法,类的排名和方法的排名均来自可靠的数据,从 GitHub 上最受欢迎的 ...
Java基础知识面试题(2020最新版)
文章目录Java概述何为编程什么是Javajdk1.5之后的三大版本JVM、JRE和JDK的关系什么是跨平台性?原理是什么Java语言有哪些特点什么是字节码?采用字节码的最大好处是什么什么是Java程序的主类?应用程序和小程序的主类有何不同?Java应用程序与小程序之间有那些差别?Java和C++的区别Oracle JDK 和 OpenJDK 的对比基础语法数据类型Java有哪些数据类型switc...
Spring面试题(2020最新版)
文章目录Spring概述(10)什么是spring?Spring框架的设计目标,设计理念,和核心是什么Spring的优缺点是什么?Spring有哪些应用场景Spring由哪些模块组成?Spring 框架中都用到了哪些设计模式?详细讲解一下核心容器(spring context应用上下文) 模块Spring框架中有哪些不同类型的事件Spring 应用程序有哪些不同组件?使用 Spring 有哪些方式...
用树莓派做一个人脸识别开锁应用
作者:eckygao,腾讯 CSIG 云产品部1.案例概述1.1 背景实现一个人脸识别进行开锁的功能,用在他的真人实景游戏业务中。总的来说,需求描述简单,但由于约束比较多,在架构与选型上...
C语言写个贪吃蛇游戏
贪吃蛇是个非常经典的游戏,用C语言来实现也是一个好玩的事情。这个游戏我写完后放在知乎,竟然点赞的人数超级多。我觉得大家喜欢,一个方面是因为写得简单,大家都能看得懂,一个可扩展性还是非常强...
出不了门的日子,我选择在 GitHub 上快乐的打游戏
作者 | Rocky0429 来源 | Python空间 大家好,我是 Rocky0429,一个在家憋到长蘑菇的蒟蒻… 2020 年的开年因为一些大家都知道的原因,有些不顺,但还是要捏捏自己的脸蛋儿,微笑的面对,毕竟日子还是要过下去… 要点脸皮,不能出门,假期又一延再延,作为一个从小熟读结发悬梁铁锥刺骨囊萤照读牛角挂书等典故的社会主义好青年,我决定趁这段时间好好充实自己,争取早日上...
7年加工作经验的程序员,从大厂跳槽出来,遭遇了什么?
引言      很久没写文章了,只是隔一两个月更新篇小说,回想起来,LZ至今工作也8年了,回想起来,一时间难免感慨,时间真的过的太快了。   当初在北京的4年多,是LZ工作中最精彩的一段经历,这也是为何LZ的小说以LZ在北京打拼时的真实经历为背景,因为那是一段难忘而又精彩的时光。   16年偶得一个大厂的offer,因此LZ就毅然决然的来到了杭州,来到杭州以后,LZ的工作平淡了许多,或许和...
为什么大多数人永远不会真正成功?
前几天看到一个叫做《为什么大多数人永远不会真正成功?》的视频,我本来以为是鸡汤,耐着性子看了一个开头,立刻被吸引了,居然一口气看完了。看完了以后,我对照着自己这10多年的经历反思了一下...
一篇文章带你入门爬虫丶刷网课丶刷文章阅读量丶刷刷刷。
走过路过不要错过,学不会没关系,长点见识也是可以的啦。 简介 博主于17年开始自学的python, 期间做过各个领域的python开发,包括爬虫, web, 硬件, 桌面应用, AI, 数据分析。 可能有人会问python能做硬件开发?可自行搜索pyboard丶树莓派丶MicroPython, 描述python最有精髓的一句话: python 除了不能生孩子, 啥都能干。 通过该篇文章,读者可以...
Python3怎么处理Excel中的数据(xlrd、xlwt的使用方法)
最近在做毕设,需要把Excel中的数据进行处理,但是。。有346469条数据,不能每一条都自己进行运算并且将它进行归一化运算
python --图像处理基础
一、PIL-Python图像库 二、 三、
疫情期间,天天对着你“开枪”的额温枪,你知道它的工作原理吗?
冠状病毒疫情期间,大家都知道口罩脱销了,消毒酒精脱销了,其实医用的额温枪也脱销了,一枪难求,因为其快速测温(1秒测温),无接触测温的特点,在很多地方被广泛使用,额温枪成了名副其实的防疫物质,此篇博客讲述额温枪的工作原理。
快速傅里叶变换(研二的我终于弄懂了)
这是一篇由快速傅里叶引起的知识惨案,竟然耗费了我好几天时间;不过一想起那只蝙蝠,我就觉得,会耗你就多耗点
认清现实|别再忽悠大学生创业了
大学之道,在明明德,在亲民,在止于至善。知止而后有定,定而后能静,静而后能安,安而后能虑,虑而后能得。物有本末,事有终始,知所先后,则近道矣。