I have the following task.
I need to create a console application that takes one parameter: the number of records to generate. Each record is a person's name and address. I created an adress table with state, city and zip-code fields, and another table with first-name and last-name columns. I use HugSQL to talk to PostgreSQL. I want to randomly mix addresses, first names and last names and print the result to the console; the number of generated values depends on the argument passed to the application. This is my code:
(ns project.core
  (:require
   [project.db.get :as get]))

(defn parse-int [s]
  (Integer. (re-find #"\d+" s)))

(def usa-data (get/usa))

(defn usa-adress-getter []
  (let [data (into {} (shuffle usa-data))
        city (get data :city)
        state (get data :state)
        zip (get data :zip_code)]
    (str state " " city " " zip)))

(defn repeater [times]
  (dotimes [i times]
    (println (usa-adress-getter))))

(defn -main [value]
  (repeater (parse-int value)))
Here I am only checking the result of the usa-adress-getter function. But the function takes too long to evaluate: my limit is 1 million values in 1 minute. How can I speed it up?
The function (get/usa) retrieves all rows from the adress table.
usa-address-getter looks strange. Does it even work properly? It really shouldn't, because you shadow clojure.core/get with project.db/get. Please check the code.
(into {} (shuffle usa-data)) looks suspicious, since usa-data should be a sequence of records, so pouring it into a map makes no sense. Maybe it should be (into {} (first (shuffle usa-data)))? Anyhow, this seems to be the key to the low performance: you eagerly shuffle a million items on every iteration, and that is really slow (about 250 ms on my machine). I would advise you to go with rand-nth: (into {} (rand-nth usa-data))
(clojure.pprint/pprint (repeatedly (parse-int value) usa-address-getter)) and throw the repeater function away
get.clj. This file contains the path to the SQL and the function (defn usa [] (all-usa db/dbspec))
usa-data example, and desired output
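The rand-nth suggestion can be sketched roughly as follows. This is only an illustration under stated assumptions: the three rows in usa-data are hypothetical stand-ins for whatever (get/usa) returns, and the row keys (:state, :city, :zip_code) are taken from the question's code. Holding the rows in a vector matters here, because rand-nth on a vector is a constant-time lookup, whereas on a lazy sequence it has to walk the elements; with the real data that would presumably mean (vec (get/usa)).

```clojure
;; Hypothetical stand-in for the rows returned by (get/usa).
;; Kept as a vector so that rand-nth is a constant-time lookup.
(def usa-data
  [{:state "NY" :city "New York"    :zip_code "10001"}
   {:state "CA" :city "Los Angeles" :zip_code "90001"}
   {:state "IL" :city "Chicago"     :zip_code "60601"}])

(defn usa-address-getter
  "Picks one random row and formats it as \"state city zip\".
  Nothing is shuffled; only a single element is looked up."
  []
  (let [{:keys [state city zip_code]} (rand-nth usa-data)]
    (str state " " city " " zip_code)))

(defn -main [value]
  ;; Print `value` random addresses, one per line.
  (dotimes [_ (Integer/parseInt value)]
    (println (usa-address-getter))))
```

Compared with the original, the per-call cost no longer depends on the size of the table, since each call touches one row instead of shuffling all of them.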