Answers for the Japanese Base dictionary question:


Two types of CJK search will be available:


1.)    CJK Natural Search – ナチュラルサーチ -  flexible search – currently available

a.       This includes hankaku-to-zenkaku (half-width to full-width) katakana normalization and zenkaku-to-hankaku (full-width to half-width) alphanumeric normalization, i.e. single-byte vs. double-byte normalization.

                                                               i.      The user may turn this normalization off if they do not want it.

b.      Onbiki searches: the long-vowel mark (ー) is ignored, so a search for マネージャー returns the same results as マネジャ.

                                                               i.      The user may turn this feature off if they do not want it.

c.       Thesaurus (Synonym) mappings (these are generally created manually through the use of developer studio or IAP workbench)

                                                               i.      Another option is to use a batch script to create the thesaurus.xml file.

d.      Phrased Search

       & nbsp;                                                       i.      a search for “朝日新聞” will not return “は月曜だったので早く起きようと思ったが、寝坊して新聞を読む暇もなかった。

1.       CJK Natural Search is semi-intelligent in that it returns only results that contain the full phrase 朝日新聞.

                                                             ii.      The user has the option of changing this to a fully wildcarded search, so that the search in (i) returns everything. (Customers who run a lot of numeric data searches may prefer this.)

e.      Multiple phrased search

                                                               i.      The user has the option of submitting multiple phrases in one query. Submitting 朝日(space)新聞 sends “朝日” and “新聞” as independent phrases, so that all results containing “新聞” are returned as well as all results containing “朝日”.

f.      Stop words

                                                               i.      If you do not want certain words to be included in the search index, you can add them as stop words. For example: Sonyプレステ、任天堂DS

1.       Adding の as a stop word lets users submit shortcut searches such as Sonyプレステ or 任天堂DS.

a.       Stop words may be useful for part numbers and merchandise.

b.      Other stopwords include: は に が  を
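
The Natural Search behaviours listed above (width normalization, onbiki folding, stop words, and multiple phrases) can be illustrated with a small Python sketch. This is only an approximation for discussion, not the product's implementation: the function names, the option names, and the use of Unicode NFKC for width normalization are all assumptions, and the stop-word handling is a naive character-level removal.

```python
import unicodedata

ONBIKI = "\u30fc"      # "ー", the katakana long-vowel mark
STOP_WORDS = ("の",)   # hypothetical stop-word list, for illustration only


def normalize_width(text: str) -> str:
    # NFKC folds half-width katakana to full-width and
    # full-width alphanumerics to half-width (assumed stand-in
    # for the product's hankaku/zenkaku normalization).
    return unicodedata.normalize("NFKC", text)


def strip_onbiki(text: str) -> str:
    # Drop the long-vowel mark so マネージャー and マネジャ compare equal.
    return text.replace(ONBIKI, "")


def remove_stop_words(text: str, stop_words=STOP_WORDS) -> str:
    # Naive removal: deletes every occurrence, even inside other words.
    for word in stop_words:
        text = text.replace(word, "")
    return text


def normalize(text, fold_width=True, fold_onbiki=True, stop_words=STOP_WORDS):
    # Each step is optional, mirroring the user-selectable features.
    if fold_width:
        text = normalize_width(text)
    if fold_onbiki:
        text = strip_onbiki(text)
    return remove_stop_words(text, stop_words)


def search(query, documents, **options):
    # Whitespace (including the ideographic space) separates independent
    # phrases; a document matches if it contains ANY whole phrase.
    phrases = [normalize(p, **options) for p in query.split()]
    phrases = [p for p in phrases if p]  # ignore phrases that became empty
    return [
        doc for doc in documents
        if any(p in normalize(doc, **options) for p in phrases)
    ]


docs = [
    "マネジャ募集",
    "ﾏﾈｰｼﾞｬｰ会議",
    "Sonyのプレステ",
    "朝日新聞を読む",
    "新聞を読む暇もなかった",
    "ＡＢＣ１２３",
]
print(search("マネージャー", docs))
print(search("朝日新聞", docs))
print(search("朝日 新聞", docs))
```

In this sketch a single phrase such as 朝日新聞 only matches documents containing the whole phrase, while 朝日(space)新聞 matches documents containing either part. Passing `fold_onbiki=False` or `fold_width=False` corresponds to the user choosing not to use those features.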


2.)    CJK Linguistics Search – 言語サーチ

a.       This is currently in beta and is expected to be available around the end of Q2.

b.      Should contain the following features:

                                                               i.      Customizable dictionary

                                                             ii.      Spell correction

                                                            iii.      Stemming

                                                            iv.      Base dictionary: created dynamically and dependent on the data (it is not yet known how many original words it contains)

                                                             v.      Half-width and full-width normalization