\$\begingroup\$

I am reading the Hacker's Guide to Neural Networks. Since I am also learning Clojure, I tried to implement them in Clojure.

I would like feedback on what could be made more idiomatic or otherwise improved in the code below. I also have some problems, which I describe at the end.

Here I have simple gates, such as Multiply, that take inputs from other gates and produce an output. The circuit is built by composing these gates.

Two functions, forward and backward, are defined. forward computes the output value of the circuit from a given set of input values. backward computes the gradient, i.e. the pull that should be applied to the inputs to make the output value more positive.

backward uses forward in its calculation. It returns values (pulls) which, when applied to the inputs in a small step, produce an output value greater than before.
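As a minimal illustration of the pull described above, here is a single multiply gate written with plain functions. The names forward-multiply, backward-multiply and nudge, and the step size 0.01, are assumptions made for this sketch and are not part of the code below.

```clojure
;; Hypothetical helper names, for illustration only.
(defn forward-multiply
  "Output of a multiply gate: f(x, y) = x * y."
  [x y]
  (* x y))

(defn backward-multiply
  "Gradient of f(x, y) = x * y with respect to [x y], which is [y x]."
  [x y]
  [y x])

(defn nudge
  "Moves x and y a small step along the gradient and returns the new inputs."
  [x y step]
  (let [[dx dy] (backward-multiply x y)]
    [(+ x (* step dx)) (+ y (* step dy))]))

(let [[x' y'] (nudge -2.0 3.0 0.01)]
  (println (forward-multiply -2.0 3.0)   ; output before the nudge
           (forward-multiply x' y')))    ; output after the nudge, slightly larger
```

With x = -2 and y = 3 the output starts at -6 and moves to roughly -5.87 after the nudge, so the pull makes the output more positive.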

;; A single unit has a forward value and a backward gradient.
(defrecord Unit 
    [value gradient name]
  Object
  (toString [_]
    (str name " : ")))

;; A Gate has two input units.
;; A type hint is written ^Unit; ^:Unit would only attach {:Unit true} as metadata.
(defrecord Gate
    [^Unit input-a ^Unit input-b])

(defprotocol GateOps
  "Basic gate operations: forward and backward are the two
  protocol operations that each gate must support."
  (forward [this] "Gives the output value computed from the inputs; used when
going forward through the circuit.")
  (backward [this back-grad] "Gives the gradient for each input. back-grad is
the gradient arriving from this gate's output. backward calculates the
derivative that generates the backward pull."))

;; A Unit is the basic unit of the circuit, hence the simple operations.
(extend-protocol GateOps
  Unit
  (forward [this]
    (:value this))
  (backward [this back-grad]
    {this back-grad}))

;; MultiplyGate takes two inputs and multiplies their values going forward.
(defrecord MultiplyGate [input-a input-b]
  GateOps
  (forward [this]
    (* (forward (:input-a this)) (forward (:input-b this))))

  (backward [this back-grad]
    (let [input-a (:input-a this)
          input-b (:input-b this)
          val-a (forward input-a)
          val-b (forward input-b)]
      (merge-with + (backward input-a (* val-b back-grad))
                  (backward input-b (* val-a back-grad))))))

;; AddGate adds the values of its two inputs.
(defrecord AddGate [input-a input-b]
  GateOps
  (forward [this]
    (+ (forward (:input-a this)) (forward (:input-b this))))

  (backward [this back-grad]
    (let [input-a (:input-a this)
          input-b (:input-b this)]
      (merge-with + (backward input-a (* 1.0 back-grad))
                  (backward input-b (* 1.0 back-grad))))))


(defn sig 
  "Sigmoid function : f(x) = 1/(1 + exp(-x))"
  [x]
  (/ 1 (+ 1 (Math/pow Math/E (- x)))))

;; SigmoidGate applies sig on input.
(defrecord SigmoidGate [gate]
  GateOps
  (forward [this]
    (sig (forward (:gate this))))

  (backward [this back-grad]
    (let [s (forward this) 
          ;; s is (sig input) i.e. output
          ds (* s (- 1 s) back-grad)]
      (backward (:gate this) ds))))


(defmacro defunit 
  "Creates a Unit that also stores the name of the variable."
  [var-name body]
  `(def ~var-name (~@body (name '~var-name))))

;; neural network : f(x,y) = sig(a*x + b*y + c)
(defunit a (->Unit 1.0 0.0))
(defunit b (->Unit 2.0 0.0))
(defunit c (->Unit -3.0 0.0))
(defunit x (->Unit -1.0 0.0))
(defunit y (->Unit 3.0 0.0))

(def ax (->MultiplyGate a x))
(def by (->MultiplyGate b y))
(def axc (->AddGate ax c))
(def axcby (->AddGate axc by))
(def sigaxcby (->SigmoidGate axcby ))

Running:

neurals.core> (forward sigaxcby)
0.8807970779778823      
neurals.core> (clojure.pprint/pprint (backward sigaxcby 1.0))
{{:value 2.0, :gradient 0.0, :name "b"} 0.31498075621051985,    
 {:value 3.0, :gradient 0.0, :name "y"} 0.20998717080701323,    
 {:value -3.0, :gradient 0.0, :name "c"} 0.10499358540350662,    
 {:value -1.0, :gradient 0.0, :name "x"} 0.10499358540350662,
 {:value 1.0, :gradient 0.0, :name "a"} -0.10499358540350662}

nil
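One way to sanity-check these gradients is a numerical estimate by central differences. The sketch below is independent of the records above: it rewrites the same circuit as a plain function over a vector [a b c x y]. The names circuit and numerical-gradient and the step h = 1e-5 are assumptions chosen for the example.

```clojure
(defn sig
  "Sigmoid function: f(x) = 1/(1 + exp(-x))."
  [x]
  (/ 1.0 (+ 1.0 (Math/exp (- x)))))

(defn circuit
  "f(a, b, c, x, y) = sig(a*x + b*y + c), taking the inputs as one vector."
  [[a b c x y]]
  (sig (+ (* a x) (* b y) c)))

(defn numerical-gradient
  "Central-difference estimate of the gradient of f at inputs (a vector),
  perturbing one coordinate at a time by h."
  [f inputs h]
  (vec (for [i (range (count inputs))]
         (let [up   (update inputs i + h)
               down (update inputs i - h)]
           (/ (- (f up) (f down)) (* 2 h))))))

;; Order of the result is [a b c x y].
(numerical-gradient circuit [1.0 2.0 -3.0 -1.0 3.0] 1e-5)
```

The estimates should agree with the values printed by backward to several decimal places, for example about -0.105 for a and about 0.315 for b.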

Queries:

  1. backward is inefficient: it calls forward on the same sub-circuit multiple times.

  2. Once I know the gradients, how do I apply them to the circuit's input values, given that they are immutable?

I think these problems show that there is a design flaw.

Any suggestions?

\$\endgroup\$

1 Answer

\$\begingroup\$

When it finally dawned on me, it was an aha moment!

I think the solution to both my problems lies in removing state, which is the essence of functional programming.

Rather than storing value and gradient in my Units, I just let them be the left-most end points of the circuit. The circuit as a whole is now just a function that is supplied with the initial input values. When a Unit is asked for its forward, it returns the value provided for it in the input data flowing through the circuit.

;; A single unit is the input-to-the-circuit.
;; It is identified by its `id`.
(defrecord Unit 
    [id name]
  Object
  (toString [_]
    (str name " : ")))

;; A Gate has two input units.
;; A type hint is written ^Unit; ^:Unit would only attach {:Unit true} as metadata.
(defrecord Gate
    [^Unit input-a ^Unit input-b])

(defprotocol GateOps
  "Basic gate operations: forward and backward are the two
  protocol operations that each gate must support."
  (forward [this input-values] "Gives the output values computed from the input
gate(s); used when going forward through the circuit. input-values maps each
unit id to the value fed into that unit.")
  (backward [this back-grad init-values] "Gives the gradient for each input.
back-grad is the gradient arriving from this gate's output. init-values is the
map of node values propagated around the circuit; as used below it is the map
returned by forward. backward calculates the derivative that generates the
backward pull."))

;; A Unit is the mouth of the circuit, hence the simple operations.
(extend-protocol GateOps
  Unit
  (forward [this input-values]
    (let [id (:id this)]
      {id (input-values id)}))
  (backward [this back-grad init-values]
    {(:id this) back-grad}))

;; MultiplyGate takes two inputs and multiplies their values going forward.
(defrecord MultiplyGate [id input-a input-b]
  GateOps
  (forward [this input-values]
    (let [id (:id this)
          input-a (:input-a this)
          input-b (:input-b this)
          val-a-map (forward input-a input-values)
          val-b-map (forward input-b input-values)]
      ;; Plain merge: a node shared by both sub-circuits has the same value
      ;; in each map, so the two entries must not be summed.
      (-> (merge val-a-map val-b-map)
          (assoc id (* (val-a-map (:id input-a))
                       (val-b-map (:id input-b)))))))

  (backward [this back-grad init-values]
    (let [input-a (:input-a this)
          input-b (:input-b this)
          val-a (init-values (:id input-a))
          val-b (init-values (:id input-b))]
      (-> 
       (merge-with + (backward input-a (* val-b back-grad) init-values)
                   (backward input-b (* val-a back-grad) init-values))
       (assoc  (:id this) back-grad)))))

;; AddGate adds the values of its two inputs.
(defrecord AddGate [id input-a input-b]
  GateOps
  (forward [this input-values]
    (let [id (:id this)
          input-a (:input-a this)
          input-b (:input-b this)
          val-a-map (forward input-a input-values)
          val-b-map (forward input-b input-values)]
      ;; Plain merge: a node shared by both sub-circuits has the same value
      ;; in each map, so the two entries must not be summed.
      (-> (merge val-a-map val-b-map)
          (assoc id (+ (val-a-map (:id input-a))
                       (val-b-map (:id input-b)))))))

  (backward [this back-grad init-values]
    (let [input-a (:input-a this)
          input-b (:input-b this)]
      (-> 
       (merge-with + (backward input-a (* 1.0 back-grad) init-values)
                   (backward input-b (* 1.0 back-grad) init-values))
       (assoc   (:id this) back-grad)))))


(defn sig 
  "Sigmoid function : f(x) = 1/(1 + exp(-x))"
  [x]
  (/ 1 (+ 1 (Math/pow Math/E (- x)))))

;; SigmoidGate applies sig on input.
(defrecord SigmoidGate [id gate]
  GateOps
  (forward [this input-values]
    (let [id (:id this)
          input-a (:gate this)
          val-a-map (forward input-a input-values)]
      (assoc val-a-map id (sig (val-a-map (:id input-a))))))

  (backward [this back-grad init-values]
    (let [s (init-values (:id this))
          ;; s is (sig input) i.e. output
          ds (* s (- 1 s) back-grad)]
      (->
       (backward (:gate this) ds init-values)
       (assoc (:id this) back-grad)))))


(defmacro defunit 
  "Creates a Unit that also stores the name of the variable."
  [var-name body]
  `(def ~var-name (~@body (name '~var-name))))

;; neural network : f(x,y) = sig(a*x + b*y + c)
(defunit a (->Unit 0))
(defunit b (->Unit 1))
(defunit c (->Unit 2))
(defunit x (->Unit 3))
(defunit y (->Unit 4))

(def ax (->MultiplyGate 5 a x))
(def by (->MultiplyGate 6 b y))
(def axc (->AddGate 7 ax c))
(def axcby (->AddGate 8 axc by))
(def sigaxcby (->SigmoidGate 9 axcby ))

(clojure.pprint/pprint 
 (backward sigaxcby 1.0 
           ;; forward takes a map { unit-id  input-to-unit }
           (forward sigaxcby {0 1.0
                              1 2.0
                              2 -3.0
                              3 -1.0
                              4 3.0
                              })))

Terminal View:

neurals.core>           (clojure.pprint/pprint (forward sigaxcby {0 1.0
                              1 2.0
                              2 -3.0
                              3 -1.0
                              4 3.0
                              }))
{0 1.0,
 7 -4.0,
 1 2.0,
 4 3.0,
 6 6.0,
 3 -1.0,
 2 -3.0,
 9 0.8807970779778823,
 5 -1.0,
 8 2.0}
nil
neurals.core> (clojure.pprint/pprint 
                  (backward sigaxcby 1.0 
                    ;; forward takes a map { unit-id  input-to-unit }
                     (forward sigaxcby {0 1.0
                              1 2.0
                              2 -3.0
                              3 -1.0
                              4 3.0
                              })))

{0 -0.10499358540350662,
 7 0.10499358540350662,
 1 0.31498075621051985,
 4 0.20998717080701323,
 6 0.10499358540350662,
 3 0.10499358540350662,
 2 0.10499358540350662,
 9 1.0,
 5 0.10499358540350662,
 8 0.10499358540350662} 
nil

This looks good to me. :)
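With values and gradients both held in plain maps keyed by id, the second query in the question (applying gradients to immutable inputs) could be handled by building a new input map rather than mutating anything. The following is a minimal sketch under stated assumptions: apply-gradients, circuit-output and the step size 0.01 are hypothetical names and values introduced here, and circuit-output is a plain-function stand-in for the circuit so that the snippet runs on its own.

```clojure
(defn sig
  "Sigmoid function: f(x) = 1/(1 + exp(-x))."
  [x]
  (/ 1 (+ 1 (Math/exp (- x)))))

(defn circuit-output
  "Stand-in for the circuit: sig(a*x + b*y + c), with inputs keyed by unit id."
  [{a 0, b 1, c 2, x 3, y 4}]
  (sig (+ (* a x) (* b y) c)))

(defn apply-gradients
  "Returns a new input map in which each value has moved by step * gradient.
  Only ids present in input-values are updated, so gate ids in the gradient
  map are ignored."
  [input-values gradients step]
  (reduce-kv (fn [m id v]
               (assoc m id (+ v (* step (get gradients id 0.0)))))
             {}
             input-values))

(def inputs {0 1.0, 1 2.0, 2 -3.0, 3 -1.0, 4 3.0})

;; Gradients for the unit ids, copied from the backward output above.
(def gradients {0 -0.10499358540350662
                1 0.31498075621051985
                2 0.10499358540350662
                3 0.10499358540350662
                4 0.20998717080701323})

(def new-inputs (apply-gradients inputs gradients 0.01))

(println (circuit-output inputs) (circuit-output new-inputs))
```

The original inputs map is left untouched, and the output for new-inputs should be slightly larger than for inputs. In the real circuit, new-inputs would simply be passed to forward again.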

\$\endgroup\$
