Clojure China

Recognizing Chinese numerals below one thousand

#1

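Hi everyone! In the project I'm working on, I need to recognize numbers below one thousand that are written in Chinese, such as “八百零九” (809).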

Since I don't need it, I've skipped the abbreviated form “三百三” (330) for now and only accept the full form “三百三十”.

Below is my current code; criticism is welcome! Thanks @_@

(defn chinese-number
  "Parses the Chinese numeral (below 1000) at the start of s.
  Parsing stops at the first unrecognized character."
  {:test
   #(let [f chinese-number]
      (tt/comprehend-tests
       (t/is (= 0   (f "零")))
       (t/is (= 1   (f "一")))
       (t/is (= 10  (f "十")))
       (t/is (= 12  (f "十二")))
       (t/is (= 20  (f "二十行")))
       (t/is (= 34  (f "三十四")))
       (t/is (= 567 (f "五百六十七")))
       (t/is (= 809 (f "八百零九回")))))}
  [s]
  (let [nmap {\十 10 \百 100}                                          ; units
        dmap {\零 0 \一 1 \二 2 \三 3 \四 4 \五 5 \六 6 \七 7 \八 8 \九 9}  ; digits
        c    (first s)]
    (cond (empty? s)         0
          ;; a bare 十 means 一十
          (= c \十)          (+ 10 (chinese-number (rest s)))
          ;; 零 is only a placeholder, so skip it
          (= c \零)          (chinese-number (rest s))
          (contains? dmap c) (let [d (dmap c)
                                   n (second s)]
                               (if (contains? nmap n)
                                 ;; digit followed by a unit, e.g. 八百
                                 (+ (* d (nmap n)) (chinese-number (nthrest s 2)))
                                 ;; trailing digit
                                 d))
          ;; an unrecognized character ends the number
          :else              0)))

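Note: t is an alias for clojure.test, and tt/comprehend-tests is a macro I wrote myself; it checks several truth values and throws an error if any of them is false.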
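For readers who don't have the author's tt namespace, here is a minimal sketch of what such a macro might look like. The name all-true-or-throw and its body are assumptions, not the author's actual code; the only behavior taken from the post is "check several truth values and throw if any is false". Throwing matters here because clojure.core/test treats a :test function as passing unless it throws.

```clojure
;; Hypothetical stand-in for tt/comprehend-tests (not the author's code):
;; evaluates every form, then throws if any result is falsey.
(defmacro all-true-or-throw [& forms]
  `(let [results# [~@forms]]
     (if (every? identity results#)
       true
       (throw (ex-info "At least one check failed" {:results results#})))))

;; All checks pass, so the macro returns true.
(all-true-or-throw (= 1 1) (= 2 (inc 1)))

;; One check fails, so the macro throws; it is caught here only to show the data.
(try
  (all-true-or-throw (= 1 1) (= 1 2))
  (catch clojure.lang.ExceptionInfo e
    (:results (ex-data e))))   ; the vector [true false]
```

Collecting all results before throwing means every check runs even when an early one fails, which is one reasonable reading of "checks several truth values".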

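  1. Passing “二十行” threw a NullPointerException, so I added an :else branch to the cond.
  2. The code was too long and needed a scrollbar to read in full, so I put each cond branch's test and expr on the same line.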
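The NullPointerException in item 1 is most likely the usual cond pitfall: when no clause matches and there is no :else, cond returns nil, and adding nil to a number throws. The small sketch below isolates that behavior; digit-or-nil and add-twenty are hypothetical helpers written for this illustration, not part of the original function.

```clojure
;; A cond with no matching clause and no :else returns nil.
(defn digit-or-nil [c]
  (cond (= c \零) 0
        (= c \一) 1))

(digit-or-nil \行)   ; nil, because no clause matches

;; Adding that nil to a number is what raises the exception.
;; The try/catch is here only so the example can report it.
(defn add-twenty [x]
  (try
    (+ 20 x)
    (catch NullPointerException _ :npe)))

(add-twenty (digit-or-nil \行))   ; :npe
(add-twenty (digit-or-nil \一))   ; 21
```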
#2

Here is another approach: process the elements two at a time. The code is as follows:

(def zh-arab {"零" 0 "一" 1 "二" 2 "三" 3 "四" 4 "五" 5
              "六" 6 "七" 7 "八" 8 "九" 9 "十" 10 "百" 100})

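;; clojure.string is not guaranteed to be loaded, so require it explicitly
(require 'clojure.string)

;; Clean the data: split into characters, map each to its value, drop unknowns
(defn clear-data [s]
  (remove nil? (map zh-arab (clojure.string/split s #""))))

;; Combine a group of up to two elements:
;; multiply if the group ends in a unit (10 or 100), otherwise add
(defn sum-two [s]
  (cond
    (or (= 10 (last s))
        (= 100 (last s))) (reduce * (first s) (next s))
    :else (reduce + (first s) (next s))))

(defn chinese-number [s]
  (loop [result 0
         element (clear-data s)]
    (if (empty? element)
      result
      (recur (+ result (sum-two (take 2 element))) (drop 2 element)))))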

;; Simple tests (each expression should evaluate to true)
(= 0   (chinese-number "零"))
(= 1   (chinese-number "一"))
(= 10  (chinese-number "十"))
(= 12  (chinese-number "十二"))
(= 20  (chinese-number "二十行"))
(= 34  (chinese-number "三十四"))
(= 567 (chinese-number "五百六十七"))
(= 809 (chinese-number "八百零九回"))

If you spot any problems, please point them out.

#3

Hi bugeb, thanks for offering another approach!

I've worked through your code, and I believe it is correct :)

Also, your clear-data function is quite interesting ;~)

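I tried extending the problem a little to handle larger numbers, for example 一千零六十七 (1067). The “零” in the middle then needs special handling (and, of course, "千" 1000 has to be added to zh-arab first).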
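To make the difficulty concrete: with "千" added, clear-data would turn 一千零六十七 into (1 1000 0 6 10 7), and fixed pairs give (1 1000) (0 6) (10 7), that is 1000 + 6 + 17 = 1023, because the 零 shifts the pairing by one. One possible way around this, sketched below, is a single pass that remembers the last digit until its unit arrives. The names digits, units and chinese-number-1k are illustrative and do not come from the thread, and the sketch assumes that parsing should stop at the first unrecognized character, as in post #1.

```clojure
;; Sketch only: names and structure are illustrative, not from the thread.
(def digits {\一 1 \二 2 \三 3 \四 4 \五 5 \六 6 \七 7 \八 8 \九 9})
(def units  {\十 10 \百 100 \千 1000})

(defn chinese-number-1k
  "Parses the Chinese numeral (below 10000) at the start of s."
  [s]
  (loop [cs      (seq s)
         total   0      ; sum of the digit*unit groups seen so far
         pending nil]   ; last digit read, still waiting for its unit
    (let [c (first cs)]
      (cond
        ;; 零 is only a placeholder: skip it and reset the pending digit.
        (= c \零)  (recur (next cs) total nil)
        (digits c) (recur (next cs) total (digits c))
        ;; A unit multiplies the pending digit; a bare 十 counts as 一十.
        (units c)  (recur (next cs) (+ total (* (or pending 1) (units c))) nil)
        ;; End of input or an unrecognized character: add the trailing digit.
        :else      (+ total (or pending 0))))))

(chinese-number-1k "一千零六十七")   ; 1067
(chinese-number-1k "八百零九回")     ; 809
```

Because each digit waits for the unit that follows it, a 零 can no longer shift the grouping, which is the failure mode of the fixed pairs.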

As the saying goes, the best way to read code is to try rewriting it. Below is the part I rewrote on top of your code:

;; Combine two elements: multiply if the pair ends in a unit (10 or 100), otherwise add
(defn sum-two [s]
  (apply (if (#{10 100} (last s)) * +) s))

(defn chinese-number [s]
  (let [element (clear-data s)]
    (reduce
     (fn [ret x]
       (+ ret (sum-two x)))
     0 (partition 2 2 [0] element))))
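The [0] pad in the partition call is what keeps a trailing single digit from being lost. The short comparison below uses (3 10 4), the cleaned form of 三十四, to show the difference; partition-all is mentioned only as an alternative and is not used in the thread.

```clojure
;; 三十四 after clear-data is (3 10 4): an odd number of elements.
(partition 2 [3 10 4])          ; ((3 10))        the trailing 4 is dropped
(partition 2 2 [0] [3 10 4])    ; ((3 10) (4 0))  padded with 0, as in the rewrite
(partition-all 2 [3 10 4])      ; ((3 10) (4))    an alternative that keeps short groups
```

Padding with 0 is safe here because a group ending in 0 is added rather than multiplied, so (4 0) contributes 4.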

Thanks again :thumbsup:

#4

:slightly_smiling: