Donnerstag, 31. Mai 2012

auto-declare in clojure

Automatic forward-declaration in clojure


Table of Contents

  • 1 Auto-declare in clojure
    • 1.1 The reason why clojure resolves symbols at compile time
    • 1.2 a with-auto-declare macro
    • 1.3 example of the use of with-auto-declare

1 Automatic forward-declaration in clojure

There is one think that distracts me everytime I write a lot clojure code: the lack of forward-declarations. Actually,clojure supports forward declarations,but you have to use the declare macro and I hate to stop working on my function to add a declare on top of it. When I return to the function, I have often forgot what I wanted to write … I found a few discussions on this topic here, here and here. This is what Rich Hickey answered to the question of why clojure resolves symbols at compile time (the reason why you have to declare all the vars you have not yet defined).

1.1 The reason why clojure resolves symbols at compile time

3.  Rich Hickey       
View profile  
 More options Nov 11 2008, 4:16 pm
On Nov 11, 12:24 am, Paul Barry <pauljbar...@gmail.com> wrote: 

- Show quoted text -
I can't speak for Scheme, but CL resolves symbols earlier than does 
Clojure - at read time. This causes a number of problems and a lot of 
complexity [1]. 
In Clojure, I had to solve the Lisp-1 vs defmacro problem, and did so 
by separating symbols from vars. That means that symbols are simple 
when read, and only get resolved when compiled. Were compilation to 
auto-create vars for never before seen symbols, many of the problems 
of packages would remain, so I decided that vars and other name 
resolutions needed to be created intentionally, via def/import/refer. 
This leads to many fewer errors, and avoids the problems identified in 
[1]. There is now a declare macro that makes it simple and obvious to 
intentionally claim names for later use: 
(declare a b c) 
; ...def a b c in any order 
This expands into (def a) (def b) (def c). Note that no initial values 
are supplied, so they are unbound, and you will get a runtime error if 
you use them before subsequently defining them, unlike your (def a 
nil) above, which I don't recommend. 
So, you don't need to define things in any specific order - use 
declare. 
Rich 
So the resolution of vars at compile time is good for clojure's namespaces, which are very well thought out in comparison to other lisps like common lisp. Also, it helps to avoid typos. When you spelled the name of a function wrong and clojure would allow forward-declarations, it would assume that you are referring to a var that isn't defined yet and would forward-declare that for you. The error would come up at runtime and (for me, at least) something as triviall as a spelling error is hard to see.
If you look from that position, declare is actually a feature of clojure. But nevertheless I tend to write my clojure-code in a top-down fashion, just like I think about the code. I start with highest level of abstraction at the top of my file and work down to the lower levels of abstraction. This makes the code easier to write and to read for me. Instead of writing new code above my old code or going away from my function and add a declare and go back to my function, everytime I want to have a forward declaration, I wanted to be able to specify symbols to be forward-declared without a need to leave the function and forgetting what I wanted to implement…

1.2 a with-auto-declare macro

So here is my solution: A with-auto-declare macro. With-auto-declare takes a bunch of code, searches for symbols marked to be forward-declared and adds a declare with these symbols on top of the code and gives the code with the auto-declared symbols replaced by the normal symbols. I choose to mark a symbol to be forward-declared by writing a D!- in front of it (please let me know if you have a better alternative). Here's the code:
(ns auto-declare.core)

(defn auto-declare-symbol?
  "Symbols which start with D!- and can not be resolved
   are subject to be auto-declared"
  [symb]
  (and (.startsWith (str symb) "D!-")
       (not (resolve symb))))

(defn auto-declare-symbol-to-normal-symbol [auto-declare-symbol]
  (-> auto-declare-symbol str (.substring 3) symbol))

(refer 'clojure.walk)
(defmacro with-auto-declare [symb & code]
  (let [auto-defs# (->> code flatten (filter auto-declare-symbol?) distinct)
        normal-symbols# (map auto-declare-symbol-to-normal-symbol auto-defs#)
        replacement-map# (-> {} (into (map vector auto-defs# normal-symbols#)))]
    `(do (declare ~@normal-symbols#) ~@(clojure.walk/postwalk-replace  replacement-map# code))))
Flatten extracts all symbols of the code, and distinct removes the duplicates from the auto-declare symbols. I use the function clojure.walk/postwalk-replace to replace all occurences of an auto-declare symbol with the corresponding normal-symbol. Therefor I need an replacement map, with is easily filled with into. The expression (map vector coll1 coll2) just zips the collections together. The 2-element vectors are interpreted as key-value pairs from into.

1.3 example of the use of with-auto-declare

As an example: Here is a piece of code from my blogpost: Porting PAIP to clojure - chapter 2:
(declare sentence noun-phrase verb-phrase article noun verb one-of) 
(defn  sentence [] (concat (noun-phrase) (verb-phrase)))
(defn  noun-phrase [] (concat (article) (noun)))
(defn  verb-phrase [] (concat (verb) (noun-phrase)))
(defn  article [] (one-of ["the" "a"]))
(defn  noun [] (one-of ["man" "ball" "woman" "table"]))
(defn  verb [] (one-of ["hit" "took" "saw" "liked"]))
=> #'auto-declare.core/verb
I had to declare every single function to get the code compiling… Now with my macro, the code becomes:
(with-auto-declare 
  (defn  sentence [] (concat (D!-noun-phrase) (D!-verb-phrase)))
  (defn  noun-phrase [] (concat (D!-article) (D!-noun)))
  (defn  verb-phrase [] (concat (D!-verb) (D!-noun-phrase)))
  (defn  article [] (D!-one-of ["the" "a"]))
  (defn  noun [] (D!-one-of ["man" "ball" "woman" "table"]))
  (defn  verb [] (D!-one-of ["hit" "took" "saw" "liked"])))
=> #'auto-declare.core/verb
It works! With this macro, you have both: the convienience of automatic-forward declarations when wanted, and you don't have the risk that typos are interpreted as forward-declarations. As always I am thankfull of comments/advice etc.

Edit: Ok I admit that the D!- looks somewhat ugly. I used it to prevent that it shadows an already defined symbol accidently. But, because I test if the symbol can be resolve to decide if it is an auto-declare symbol, this isn't neccesary. So I changed the with-auto-declare macro that it takes the prefix as first argument.
(ns auto-declare.core)

  (defn auto-declare-symbol?
    "Symbols which start with D!- and can not be resolved
     are subject to be auto-declared"
    [prefix symb]
    (and (.startsWith (str symb) (str prefix))
         (not (resolve symb))))

  (defn auto-declare-symbol-to-normal-symbol [prefix auto-declare-symbol]
    (-> auto-declare-symbol str (.substring (.length (str prefix))) symbol))

  (refer 'clojure.walk)

  (defmacro with-auto-declare [symb & code]
    (let [auto-defs# (->> code flatten (filter #(auto-declare-symbol? symb %)) distinct)
          normal-symbols# (map #(auto-declare-symbol-to-normal-symbol symb %1) auto-defs#)
          replacement-map# (-> {} (into (map vector auto-defs# normal-symbols#)))]
      `(do (declare ~@normal-symbols#) ~@(clojure.walk/postwalk-replace  replacement-map# code))))
=> #'auto-declare.core/with-auto-declare
The example: Now with a _ as prefix:
(with-auto-declare _
    (defn  sentence [] (concat (_noun-phrase) (_verb-phrase)))
    (defn  noun-phrase [] (concat (_article) (_noun)))
    (defn  verb-phrase [] (concat (_verb) (_noun-phrase)))
    (defn  article [] (_one-of ["the" "a"]))
    (defn  noun [] (_one-of ["man" "ball" "woman" "table"]))
    (defn  verb [] (_one-of ["hit" "took" "saw" "liked"])))
=> #'auto-declare.core/verb
Date: 2012-06-01T12:55+0200
Author: Maik Schünemann
Org version 7.8.09 with Emacs version 23
Validate XHTML 1.0

Mittwoch, 30. Mai 2012


Literate Programming in Clojure


1 Literate programming in clojure using org-mode babel

In this post I'll first show a double setup of org-mode babel-clojure with both inferior-lisp and slime, and then I will show how to use org-babel-clojure for literate programming.

1.1 Literate programming

Who doesn't know this situation: You have to modify someone others code but you have no clue what this code should do. Searching around the (often barely documentation) doesn't help much either, because it is often not clear what intention the author had with the code, or why he implemented it that way. Literate programming does not only help others reading your code, but it helps you as well,because
  • writing down your intention and explaining it makes it easier for you to code it
  • it helps you to be clear about your design decisions
  • writing it down intented to be read by others makes you to automatically write code of better quality
In Literate programming, you mix the explanation of the programm and the source code in one file. The source code can be automatically extracted. This is called tangling. It is also possible to evaluate the code while you are in the document.

1.2 Org-mode-babel

Babel is a part of org-mode that is a full featured literate programming program. Being part of org-mode, it is also able to be exported into a variety of formats including latex and html. I am using it to write my series of 'Porting PAIP to clojure'. Each blog-post is just a .org file which gets exported to html and uploaded on blogger. Here is the setup:

1.2.1 Setting up org-mode-babel with clojure

Fortunately, clojure is a supported language of org-mode, but it's documentation is outdated. So add following to your .emacs file:
(require 'ob)
(require 'ob-tangle)
(add-to-list 'org-babel-tangle-lang-exts '("clojure" . "clj"))

(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t)
   (clojure . t)))
(require 'ob-clojure)

(defun org-babel-execute:clojure (body params)
  "Evaluate a block of Clojure code with Babel."
  (concat "=> "
          (if (fboundp 'slime-eval)
              (slime-eval `(swank:interactive-eval-region ,body))
           (if (fboundp 'lisp-eval-string)
             (lisp-eval-string body)
              (error "You have to start a clojure repl first!")))))

(setq org-src-fontify-natively t)
(setq org-confirm-babel-evaluate nil)
(setq org-export-babel-evaluate nil)
(setq org-src-window-setup 'current-window)
(setq inferior-lisp-program "lein repl")
(I assume you have a recent org-mode installed) It will work with either inferior lisp using lein repl, which has the advantage that you can run it without a project.clj file, or with M-x clojure-jack-in, which gives you the goddies of slime with clojure.
I have also written a few lines to make the exported html look better and defined a skeleton to quickly insert the first lines in the .org file. If you type C-S-f4, you will be prompted for the title and the content of the skeleton will be inserted.
  (define-skeleton org-skeleton
  "Header info for a emacs-org file."
  "Title: "
  "#+TITLE:" str " \n"
  "#+AUTHOR: Maik Schünemann\n"
  "#+email: maikschuenemann@gmail.com\n"
  "#+STARTUP:showall\n"
  "-----"
 )
(global-set-key [C-S-f4] 'org-skeleton)

(setq org-export-html-style-extra "<style type=\"text/css\">pre {\n    border: 1pt solid #AEBDCC;\n   color:white;\n background-color: #000000;\n     padding: 5pt;\n font-family: courier, monospace;\n        font-size: 90%;\n        overflow:auto;\n  } </style>")

1.2.2 Interacting with clojure code.

Start a repl first. Either via M-x run-lisp or with M-x clojure-jack-in. Note, if you use the former, the result of the code won't get inserted in the .org file after evaluation.
To write some clojure-code, type the following (may be good to define a kbd-macro to speed this up)
#+begin_src clojure :results output :exports both
(def six (+ 1 2 3))
#+end_src
If you place your cursor between the beginsrc and endsrc and type C-c C-c, you will see the following:
#+begin_src clojure :results output :exports both
 (def six (+ 1 2 3))
 #+end_src

#+RESULTS:
: => #'user/six
Note that the code got evaluated and is now defined in the repl. So if you evaluate the following (either in the file, or in the repl, you will see:
#+begin_src clojure :results output :exports both
six
#+end_src 

#+RESULTS:
: => 6
So you loose nothing of the interactive development you love when coding clojure.

1.2.3 Exporting to html

When you are done, you can export the file to html by typing C-c C-e h or C-c C-e b to see the exported html in your standart browser. If you want the source code to be colorized like you see it in clojure-mode, you need to install the htmlize package (via elpa) The clojure code shown above will export to the following:
(def six (+ 1 2 3))
=> #'user/six
Cool, nicely colored clojure code in your html-documentation or your blog-post. So This should be enough to get you started using org-mode-babel with clojure for Literate Programming. It is interactive like writing normal clojure code and it helps you to think more clearly about your code.

1.3 the org-file

Here is the org-file that you see right now:

#+TITLE:Literate Programming in Clojure
#+AUTHOR: Maik Schünemann
#+email: maikschuenemann@gmail.com
#+STARTUP:showall
-----
* Literate programming in clojure using org-mode babel
  In my previous post ... I showed a setup of clojure with org-mode babel using inferior lisp.
  In this post I'll first show a double setup with both inferior-lisp and slime, and then I will
  show how to use org-babel-clojure for literate programming.
** Literate programming
   Who doesn't know this situation:
   You have to modify someone others code but you have no clue what this code should do. Searching
   around the (often barely documentation) doesn't help much either, because it is often not clear
   what intention the author had with the code, or why he implemented it that way.
   [[http://de.wikipedia.org/wiki/Literate_programming][Literate programming]] does not only help others reading your code, but it helps you as well,because
   - writing down your intention and explaining it makes it easier for you to code it
   - it helps you to be clear about your design decisions
   - writing it down intented to be read by others makes you to automatically write code of better quality
   In Literate programming, you mix the explanation of the programm and the source code in one file.
   The source code can be automatically extracted. This is called tangling. It is also possible to evaluate
   the code while you are in the document.
** Org-mode-babel
   [[http://orgmode.org/worg/org-contrib/babel/][Babel]] is a part of [[http://orgmode.org/][org-mode]] that is a full featured literate programming program. Being part of org-mode,
   it is also able to be exported into a variety of formats including latex and html. I am using it to write
   my series of 'Porting PAIP to clojure'. Each blog-post is just a .org file which gets exported to html and
   uploaded on blogger. Here is the setup:
*** Setting up org-mode-babel with clojure
    Fortunately, clojure is a supported language of org-mode, but it's documentation is outdated. So add following
    to your .emacs file:
    #+begin_src elisp
    (require 'ob)
    (require 'ob-tangle)
    (add-to-list 'org-babel-tangle-lang-exts '("clojure" . "clj"))
 
    (org-babel-do-load-languages
     'org-babel-load-languages
     '((emacs-lisp . t)
       (clojure . t)))
    (require 'ob-clojure)
 
    (defun org-babel-execute:clojure (body params)
      "Evaluate a block of Clojure code with Babel."
      (concat "=> "
              (if (fboundp 'slime-eval)
                  (slime-eval `(swank:interactive-eval-region ,body))
               (if (fboundp 'lisp-eval-string)
                 (lisp-eval-string body)
                  (error "You have to start a clojure repl first!")))))
 
    (setq org-src-fontify-natively t)
    (setq org-confirm-babel-evaluate nil)
    (setq org-export-babel-evaluate nil)
    (setq org-src-window-setup 'current-window)
    (setq inferior-lisp-program "lein repl")
    #+end_src
    (I assume you have a recent org-mode installed)
    It will work with either inferior lisp using lein repl, which has the advantage that you can run it without
    a project.clj file, or with M-x clojure-jack-in, which gives you the goddies of slime with clojure.

    I have also written a few lines to make the exported html look better and defined a skeleton to quickly insert
    the first lines in the .org file. If you type C-S-f4, you will be prompted for the title and the content of the
    skeleton will be inserted.
    #+begin_src elisp
    (define-skeleton org-skeleton
    "Header info for a emacs-org file."
    "Title: "
    "#+TITLE:" str " \n"
    "#+AUTHOR: Maik Schünemann\n"
    "#+email: maikschuenemann@gmail.com\n"
    "#+STARTUP:showall\n"
    "-----"
   )
  (global-set-key [C-S-f4] 'org-skeleton)

  (setq org-export-html-style-extra "<style type=\"text/css\">pre {\n    border: 1pt solid #AEBDCC;\n color:white;\n background-color: #000000;\n padding: 5pt;\n font-family: courier, monospace;\n        font-size: 90%;\n        overflow:auto;\n  } </style>")
    #+end_src
 
*** Interacting with clojure code.
    Start a repl first. Either via M-x run-lisp or with M-x clojure-jack-in. Note, if you use the former, the
    result of the code won't get inserted in the .org file after evaluation.

    To write some clojure-code, type the following (may be good to define a kbd-macro to speed this up)
    #+begin_example
    #+begin_src clojure :results output :exports both
    (def six (+ 1 2 3))
    #+end_src
    #+end_example
    If you place your cursor between the begin_src and end_src and type C-c C-c, you will see the following:
    #+begin_example
    #+begin_src clojure :results output :exports both
     (def six (+ 1 2 3))
     #+end_src

    #+RESULTS:
    : => #'user/six
    #+end_example
    Note that the code got evaluated and is now defined in the repl. So if you evaluate the following (either in
    the file, or in the repl, you will see:
    #+begin_example
    #+begin_src clojure :results output :exports both
    six
    #+end_src

    #+RESULTS:
    : => 6
    #+end_example
    So you loose nothing of the interactive development you love when coding clojure.

*** Exporting to html
    When you are done, you can export the file to html by typing C-c C-e h or C-c C-e b to see the exported
    html in your standart browser.
    If you want the source code to be colorized like you see it in clojure-mode, you need to install the
    htmlize package (via elpa)
    The clojure code shown above will export to the following:

    #+begin_src clojure :results output :exports both
     (def six (+ 1 2 3))
     #+end_src

    #+RESULTS:
    : => #'user/six
    Cool, nicely colored clojure code in your html-documentation or your blog-post.
    So This should be enough to get you started using org-mode-babel with clojure for Literate Programming.
    It is interactive like writing normal clojure code and it helps you to think more clearly about your code.

** the org-file
   Here is the org-file that you see right now:
   (leaving this part out to avoid infinite file size...)

Date: 2012-05-30T11:17+0200
Author: Maik Schünemann
Org version 7.8.09 with Emacs version 23
Validate XHTML 1.0

Dienstag, 29. Mai 2012

Porting PAIP to clojure chapter 2

Porting PAIP to clojure chapter 2

Chapter 2 - Generating random Sentences

1 Porting PAIP to clojure - chapter 2 A Simple Lisp Program 

1.1 Introduction

This is the first post of my series porting PAIP to clojure. It is possible to follow my post even if you don't PAIP but I strongly recommend that you get a copy of it. This chapter contains a program to generate random sentences using a grammar for a subset of english. Norvig presents two versions of the program, a straightforward solution using functions to generate random sentences from a grammar Then the weaknesses of this approach are shown and a better solution is developed using a data-driven solution. Afterwards, the advantages of the data-driven approach are discussed.

1.2 Generating random Sentences

1.2.1 The Grammar

Sentence => Noun-Phrase + Verb-Phrase
Noun-Phrase => Article + Noun
 …
 Article => the, a, ….
 …
A rule consists either of two or more non-terminals like Noun-phrase or of a collection of terminals like the, a,… This representation is very natural if you have ever worked with context-free grammars before. The rule Article => the, a, … means: An article is either the or a or … (Article = the | a | …)

1.2.2 The Program

That is the Common Lisp code for the naive solution. It can be found here.
(defun sentence ()    (append (noun-phrase) (verb-phrase)))
(defun noun-phrase () (append (Article) (Noun)))
(defun verb-phrase () (append (Verb) (noun-phrase)))
(defun Article ()     (one-of '(the a)))
(defun Noun ()        (one-of '(man ball woman table)))
(defun Verb ()        (one-of '(hit took saw liked)))
The translation to clojure is straightforward. But I will use a slightly modified version which is, as I think, a little bot more ideomatic in clojure:
(declare sentence noun-phrase verb-phrase article noun verb one-of)

(defn  sentence [] (concat (noun-phrase) (verb-phrase)))
(defn  noun-phrase [] (concat (article) (noun)))
(defn  verb-phrase [] (concat (verb) (noun-phrase)))
(defn  article [] (one-of ["the" "a"]))
(defn  noun [] (one-of ["man" "ball" "woman" "table"]))
(defn  verb [] (one-of ["hit" "took" "saw" "liked"]))
I use vectors instead of lists and strings instead of symbols. here are the helper-functions: one-of picks one element of the collection and returns a single-element collection containing it.
(defn one-of [coll]
  (if (seq coll)
   [(rand-nth coll)]))
To test, let's get 10 random sentences:
(take 10 (repeatedly sentence))   
this evaluates to:
(("a" "ball" "liked" "the" "ball")
("the" "table" "took" "the" "ball")
("the" "man" "took" "a" "man")
("the" "woman" "saw" "a" "man")
("a" "ball" "hit" "the" "ball")
("a" "table" "hit" "the" "man")
("a" "ball" "took" "a" "ball")
("a" "man" "liked" "a" "ball")
("a" "table" "saw" "the" "ball")
("the" "woman" "hit" "a" "ball"))
I believe it is apparent that this is not a good representation for a grammar. The lisp functions are definitly harder to read than the normal grammar syntax shown above. As an example, if you wanted to represent the following rule: Adj* => e,Adj + Adj* (note that e is the empty symbol) you would have to write a function like this:
(declare Adj)
(defn Adj* []
  (if (= (rand-int 2) 0)
    nil
    (concat (Adj) (Adj*))))
There has to be a better way… enter the data-driven solution!

1.3 A data-driven solution

The naive solution is straightforward to implement while making it harder to encode the grammar rules. By using a data-driven approach, it is the other war round. Writing grammar rules becomes easy while implementation may become harder. So let's take a look how the data-driven solution works for this problem.

1.3.1 The representation of the grammar

The first step is to decouple the grammar from the code which processes it. In the book Norvig represents the grammar with the following datastructure:
 (defparameter *simple-grammar*
 '((sentence -> (noun-phrase verb-phrase))
  (noun-phrase -> (Article Noun))
  (verb-phrase -> (Verb noun-phrase))
  (Article -> the a)
  (Noun -> man ball woman table)
   (Verb -> hit took saw liked))
"A grammar for a trivial subset of English.")
Here the grammar rule sentence => noun-phrase + verbphrase gets translated to
(sentence -> (noun-phrase verb-phrase)).
So a list of elements means "produce this nonterminals in sequence".
Multiple nonterminals right of the -> sign mean "choose one of this nonterminals".
Every rule is itself a list and the grammar is a list of rules.
But this representation is not really clojureish?. We can take advantage of clojures literal representation for maps,vectors and sets in addition to lists.
Using these, the meaning of a datastructure- when choosen appropriate- becomes clearer.
Let me give you an example:
What are grammar rules? Rules map a nonterminal to - posible multiple - nonterminals or terminals. Thus it is appropriate to represent them as a map in clojure. I choose to represent the nonterminals as keywords and the values of the map as either one element or a vector of multiple elements.
A vector means: "apply all elements in order".
Norvig represents a choice of nonterminals as simply the nonterminals written after the -> sign.
For me, it was not clear at the beginning that that means "choose one of these nonterminals".
Using a set in clojure, this becomes obvious. Rewritten, the grammar becomes the following:
(def simple-grammar
{:sentence [:noun-phrase :verb-phrase]
 :noun-phrase [:Article :Noun]
 :verb-phrase [:Verb :noun-phrase]
 :Article #{"the" "a"}
 :Noun #{"man" "ball" "woman" "table"}
 :Verb #{"hit" "took" "saw" "liked"}})

(def ^:dynamic *grammar* simple-grammar)
Ok that's for the design part. Now that we have a good representation of our data, the grammar, we have to worry about evaluating it.

1.3.2 Evaluating the grammar

Because I have chosen a different and more ideomatic representation of the grammar in clojure, the code for evaluating the grammar will be different than the code in PAIP. So don't expect a literal translation of the PAIP code.
So, how can we generate a possible sentence:
the function 'generate' will take the startsymbol as argument and retrieves the rule from the grammar.
If there is not a rule for the argument in the grammar, the argument itself is evaluated (thus making it possible to call generate either with the left hand or the right hand side of a rule).
It will recurse on the elements of the rule in sequence, appending the result as it goes along.
When it encounters a set, it will generate one random element of it. it recursively generates the nonterminal. If it encounters a terminal, that is none of the above are true, it just returns a single-element vector of it.
(defn generate [phrase]
    (cond (get *grammar* phrase) (generate (get *grammar* phrase))
          (sequential? phrase) (mapcat generate phrase)
          (set? phrase)  (generate (rand-nth (seq phrase)))
          :else [phrase]))
Amazing how closely the code mimics the description. So this is data-dricen programming.
Decouple the representation of the data from the code which evaluates it.
Let's take a look at a more complicated example and see how the data-driven approach scales. Here's the bigger grammar:

(def bigger-grammar
 {:sentence [:noun-phrase :verb-phrase]
  :noun-phrase #{[:Article :Adj* :Noun :PP*] :Name :Pronoun}
  :verb-phrase [:Verb :noun-phrase :PP*]
  :PP* #{[] [:PP :PP*]}
  :Adj* #{[] [:Adj :Adj*]}
  :PP [:Prep :noun-phrase]
  :Prep #{"to" "in" "by" "with" "on"}
  :Adj #{"big" "little" "blue" "green" "adiabatic"}
  :Article #{"the" "a"}
  :Name #{"Pat" "Kim" "Lee" "Terry" "Robin"}
  :Noun #{"man" "ball" "woman" "table"}
  :Verb #{"hit" "took" "saw" "liked"}
  :Pronoun #{"he" "she" "it" "these" "those" "that"}
 })
(def ^:dynamic *grammar* bigger-grammar)

Let's generate 10 sentences again:

(take 10 (repeatedly #(generate :sentence)))

(("these" "liked" "a" "ball")
 ("a" "ball" "in" "the" "big" "green" "big" "woman" "in" "that" "by" "Terry" "liked" "a" "man")
 ("that" "liked" "that")
 ("Lee" "hit" "it")
 ("Lee" "took" "Lee")
 ("the" "little" "blue" "little" "ball" "in" "Terry" "by" "Robin" "liked" "Pat")
 ("a" "adiabatic" "blue" "blue" "ball" "saw" "the" "man" "in" "Pat" "by" "Lee" "on" "a" "adiabatic" "table"
  "in" "Terry")
 ("Pat" "hit" "Robin" "to" "those")
 ("those" "liked" "the" "woman" "with" "Robin" "with" "these" "in" "the" "table" "to" "Robin" "to" "a"
  "blue" "adiabatic" "ball" "with" "she" "on" "those" "on" "those")
 ("it" "hit" "Kim" "on" "she" "on" "the" "table"))
It works! Enjoy the funny sentences.
But hey, what if we wanted to see how the sentences are generated, to see the parse-tree. Because of the data-driven design, it is easy to implement this. The data doesn't need to be changed, we need only a new evaluation-function.
(defn generate-all [phrase]
    (cond (get *grammar* phrase)  (list phrase (generate-all (get *grammar* phrase)))
          (sequential? phrase) (mapcat generate-all phrase)
          (set? phrase)  (generate-all (rand-nth (seq phrase)))
          :else phrase))
there are only two changes: make a list of phrase and the generated symbols before recursing and return the phrase in the base call instead of a single item vector containing it.
Imaging changing all functions to achieve this instead of just two lines …
(generate-all :sentence)
;=>
    (:sentence 
      (:noun-phrase 
         (:Name "Kim")
       :verb-phrase 
         (:Verb "saw"
          :noun-phrase 
            (:Article "the"
             :Adj* 
               (:Adj "adiabatic" 
                :Adj* ())
             :Noun "ball" 
             :PP* ())
          :PP* ())))
Norvig gives a last example of a generate-all function, which works on the simple grammar and returns all possible sentences defined by the grammar (the language of the grammar). I leaf the implementation to the reader :)

1.4 Advantages of the data-driven solution

Gratulations for making it through the post. With this chapter, Norvig makes a strong point which will be even more important in the next chapters (and is for programming in general). If you use the data-driven approach, you use "the most natural notation available to solve the problem". So instead of worrying how to implement the problem, worry about how to represent you data, so that it is easy to understand and to scale it. With the data-driven solution, you can
  • expand and modify the program easier
  • use different datasets with the same evaluation function
  • use different evaluation functions with the same dataset
  • represent your problem so that it is easier to understand.
If there is something you take away from this post, it is this:
Use the best notation available to solve your problem!


Chapter 4 coming soon. It contains an implementation of the GPS, the General Problem Solver.
Date: 2012-05-28T18:12+0200
Author: kima
Org version 7.8.09 with Emacs version 23
Validate XHTML 1.0

Sonntag, 27. Mai 2012

Porting PAIP to clojure - Introduction


What is PAIP?

PAIP stands for Paradigms of Artificial Intelligence Programming and is a book written by Peter Norvig.
It is wiedly seen as one of the best programming books and it is definitly the book to read if you are interesting in lisp and AI - like I am. 
The best of this book is that the programs are really developed. Norvig doesn't give the complete and bug-free implementation of the examples, but he shows the process of writing the programs: starting from a simple description to a minimal implementation - then showing limitations of the programs and how to resolve them. This makes the book a great programming book in general.

Porting PAIP to clojure 

After reading the first few chapters I decided to give me a challenge to translate the examples in Common Lisp to clojure. After reading the first few chapters I decided to give me a challenge to translate the examples in Common Lisp to clojure. 
I will not copy the whole book, but I will give a general overview of the chapters and the process of iterative refinement of the implementations of the examples.
It will be possible for you to follow the examples even if you don't have a copy of PAIP but I highly recommend that you get one. This book is worth it's money.

The Book is divided in 5 Parts:

1. Introduction to Common Lisp: 

In this part Norvig gives a short introduction to Common Lisp and highlights the advantages of common lisp. I will only translate the 2nd chapter: a simple lisp program to clojure.

2. Early AI programs

This part contains implementations of early AI programs like GPS-the general problem solver (wich is actually not so general...), ELIZA and STUDENT.

3. Tools and Techniques

This part contains topics like low-level efficiency Issues. This part will be interesting to port to the more functional clojure. It also covers Logic Programming,OOP and Knowledge Representation and Reasoning.

4. Advanced AI Programs

Norvig shows in this part examples of Advanced AI programs, like Expert Systems, Constraint Satisfaction, Natural Language Procesing and Unification Grammars. It ends with a Chapter about developing a Grammar for English.

5. The Rest of Lisp

In this part Scheme is covered. And a interpreter and a complete compiler for Scheme is implemented (with tail-call-recursion and call-cc). It contains also a very helpful chapter about Troubleshooting.

Tools

I plan to write the blog posts as Literate Programming files with org-mode + babel which has support for clojure. This has the advantage, that i can test my code on the go, extract the source-code from the document automagically and export the document to html which I post here. I am also planning to get an github account, so that i can make the files publicy available, so 
stay tuned.

Introducing my new blog

Hello to all.
After I finished my exams I have finally enough time to dig deeper into programming in general and in clojure and AI in specific and to post about my discoveries here.
 Thus my blog has the title: Exploring the programming world. Whenever I come across something interesting, I will post it here.
Clojure is a great language - a modern and functional dialect of lisp.
I enjoy especially the interactive development with clojure.
Everytime I start my PC, I start Emacs and fire up a clojure-repl.

I will start blogging about porting PAIP to clojure.