Tag Archives: haskell

Simon Already Did It!

Science and industry are bubbling with exciting new computational techniques. Parallel processing, quantum computing, GPU processing, software transactional memory. Proof-of-concept programming languages are mushrooming, but Simon already did it.

Simon Peyton Jones’ crazy academic language Haskell can do all of these things. The Hackage repository has packages for parallel processing, quantum computing, GPU processing, and software transactional memory. If there’s a buzz on some new computational technique, Haskell already did it. It’s an incredibly flexible prototyping system.

How does Haskell do all of these things when typically, each of these techniques is explored using an extremely limited prototype language? Haskell is a functional language with powerful domain specific language capability; you no longer need to trade research papers in a dozen languages when it’s easy to prototype an experimental language within Haskell.

DSLs are featured in Lisp; what makes Haskell so special? The secret sauce is declarative programming: In Haskell, the coder describes the computation to be performed, and the Haskell compiler decides an optimal way to perform the computation. Imperative programming is eschewed in favor of pure functional programming: mapping, composition, recursion. By using single assignment and avoiding state, Haskell gains flexibility in how it executes code.

Haskell does event driven computing, concurrent computing, matrix computing, even lazy computing, by isolating imperative code in monads. There are monads for pseudorandom number generation, monads for I/O, monads for multithreading. The accelerate package for behind-the-scenes GPU computing is just a monad that turns Haskell vector manipulations into CUDA, executes it on a GPU, and returns the Haskell-typed result.

The final ingredient is generalized abstract data types. When you create a data type to model something in the real world (employee IDs, aquarium fish, highway traffic, etc.), Haskell can derive how to print and parse the models from text files to live objects and back. It can determine on its own how to compare and sort sequences of data. Finally, Haskell can automatically manipulate arbitrarily complex collections of data (databases of employee IDs, store rooms of aquarium fish, graphs of graphs of graphs of highway traffic). A bloom filter of schoolmates? No problem.

There is no longer a need to learn Unified Parallel C, QCL, CUDA, libev, or MATLAB. Haskell already did it; just learn one language that can handle all these paradigms and paradigms yet to be invented. There are many free online tutorials that enable you to use graduate level techniques with an undergraduate’s knowledge. Haskell’s language prototyping flexibility makes C look like Brainfuck: you’ll still succeed, but wouldn’t you prefer to do it with more robust, more expressive tools?

How to install Haskell Platform in Ubuntu 10.04 Lucid Lynx

sudo apt-get install ghc ghc-prof cabal-install

SPARK/Ada

Cheetah wearing an M1 helmetThat’s a Cheetah wearing an M1 helmet. Yeah, SPARK/Ada is that goofy. Pascal was considered unsafe for military projects, so they made Ada. Then Ada was considered unsafe for military projects, so they made SPARK, lulz.

Modern languages with type safety include Haskell, OCaml, and Coq. Because our critical systems are too precious for antiquated technology.

Prefix, Postfix, Newfix, Bluefix

My, my, look at all those notations. You can infix: 2 + 2. You can prefix: + 2 2. You can postfix: 2 2 +. All three notations are perfectly arbitrary for the purposes of mathematics. (2 + 2) * 3 = 12, so what’s really different about these codes?

Code 1

x = 2 + 2
y = x * 3
print(y)

Code 2

x := 2 plus: 2
y := x times: 3
y print

Code 3

2 2 + 3 * print

Code 4

(let* (
   (x (+ 2 2))
   (y (* x 3)))
      (print y))

Code 1 is typical of procedural programming (Python1, C, Basic, Pascal). You declare variables, store data in variables, and display variables.

Code 2 is typical of object oriented programming (Java, C++, Ruby2, Smalltalk2). You instantiate objects, send messages to objects, and display objects.

Code 3 is typical of stack programming (Factor, Joy, Forth). You push values onto the stack, push functions onto the stack, and display the stack.

Code 4 is typical of functional programming (Haskell, ML, Erlang, Lisp). You pass expressions, evaluate expressions, and display expressions.

But what’s really different?

Codes 1 and 2 are fundamentally the same: they’re stateful computations. Code 1 stores state in variables, Code 2 stores state in objects, but it’s all the same. Code 2 is syntactical sugar for Code 1. And Code 1 is syntactical sugar for:

Code 0

add x,2,2
mul y,x,3
put y

In other words, you’re still explicitly performing register transactions. Do this calculation then STORE THE RESULT IN HERE. Do that calculation then MOVE THE BYTES TO THERE.

By contrast, Codes 3 and 4 are an island all their own: they’re functional computations. Instead of manipulating machine registers, they manipulate mathematical expressions3. Code becomes data becomes code again; you can pass functions as arguments to other functions as if they were pure mathematical constructs. Code 4 named the expressions “x” and “y”, while Code 3 didn’t, but as Code 3 shows, naming isn’t really necessary, it’s just a convenience.

So what’s really different? Functional programming provides a whole new level of abstraction, one that allows you to write more complex code than you could in low-level languages like Java. Really. Do you like moving bytes around, or would you prefer to leave that as an exercise for the compiler?

1 Python has functional elements, but they’re underused. Most Python programmers aren’t even aware that Python has list comprehensions and anonymous functions.

2 Ditto for Ruby and Smalltalk. They have blocks, and therefore can do functional programming, but it’s not emphasized. With all the OOP hullabaloo, the incentive is to write stateful methods instead of pure methods, especially in Ruby where the line between f and f! blurs because the result of the last evaluated expression is always returned.

3 Most functional languages have stateful capabilities, but they’re strictly only necessary for specialized calculations such as pseudo-random number generation, concurrent programming, and I/O. Also, functional language compilers must be stateful in order to compile functional programs that provide the abstraction of pure, stateless code. (see Haskell).

Haskell Type Transparency

program it patrick

Coding is often a matter of looking up API calls, chaining them together, and outputting the result. In a dynamically typed language like JavaScript, it can be difficult to determine exactly what kind of thing a function returns. For example, a JSON parsing library might have the function:

function parse(data) {
   ...
}

According to the docs, parse() takes JSON text and turns it into a JSON object. But will parse() return a tree literal? A manipulable JSON object? A callback that returns a JSON object? Does parse() return anything at all? Does parse() modify data in place to become a JSON object? What is the type of data anyway? A string? A URL to a location containing a string?

In statically typed languages, types are more transparent:

parse :: String -> Tree String
parse = ...

This Haskell code defines parse as a function which accepts a string and returns a tree of strings. No more slogging through Google results just to see how to use API calls. Haskell’s type strictness is a blessing in disguise; although it may seem inconvenient to specify types up front, in the long run, the code is more readable and more robust.

There’s the JavaScript bayou of playtesting combinations of o.foo(), o.bar(), o.foo(o.bar()) in the interpreter until o responds to one of them.

There’s the scurvy C, where data is could be a long, or a pointer, or a pointer to a pointer, or NULL. The type system is more what you’d call “guidelines” than actual rules.

Then there’s the calm, steady stream of lucidly typed Haskell.

Saving $125M with Haskell

marsOn September 30, 1999, a story broke that NASA’s $125,000,000 Mars Climate Orbiter zoomed out of orbit, hit the atmosphere and burned to a crisp. Why? Because parts of the system used metric units, and other parts used English units. As a rule, science is done with metric units; only tradition and old-fogeyness could cause such a goof.

Unit confusion has an obvious solution: education. Science courses, especially Physics and Chemistry, stress the conversion and explicit marking of units. “5” could mean “5 inches”, “5 centimeters”, or even “5 years”.

But education is just one way to treat the problem. Another is to force scientists, engineers, and programmers to explicitly mark units. Haskell can do this with its type system. It’s called SafeUnits.

data Distance = Cm Double | In Double deriving (Eq)

And you’re ninety percent done. Your code can manipulate centimeters, inches, or both, but any function that takes or returns a Distance must explicitly handle each unit type.

data Farad = Farad Double deriving (Eq)

capacitance :: Distance -> Farad
capacitance (Cm x) = Farad (1.113e-12 * x)

Now you can calculate the capacitance of centimeters, but only in centimeters like a good scientist. Haskell will complain if you try any other unit.

> capacitance (Cm 1)
1.0 cm of capacitance = 1.113e-12 Farad
> capacitance (In 1)
safeunits.hs: safeunits.hs:25:0-41: Non-exhaustive patterns in function capacitance

If you’ve ever raged over a problem that didn’t specify units, you’re among friends.

Caveat 1: SafeUnits does not make inch-capacitance calculation impossible; it simply raises an error for inch-capacitance until the functionality is added.

Caveat 2: Eq is automatically derived, but not Ord. Because comparing units of measurement (<, >) is not something the type system can do intuitively. Declare instances for that kind of functionality.

Caveat 3: SafeUnits does not prevent programmers from writing Cm 1 and thinking In 1

Caveat 4: SafeUnits does not prevent programmers from unboxing Cm 1 to 1 and manipulating the raw Doubles.

And that’s how to use Haskell’s type system as an explicit reminder and strict enforcer of units.

U JELLY NASA?

You might want to read Troll Science to recoop that loss.

JavaScript is the new λ

lambdaAn obsession with programming languages has lead me down the trail of functional programming. Lisp, Haskell, Erlang. They’re so cool! Functional programming reduces typing and increases reusability. Instead of boilerplate for loops, simple maps. Note: This idea is hardly new.

C – Imperative

int *xs = {1, 2, 3};
int *ys = malloc(3 * sizeof(int));

int i;
for (i = 0; i < 3; i++) {
   ys[i] = xs[i] + 1;
}

Haskell – Functional

let ys = map (\x -> x + 1) [1, 2, 3]

JavaScript is usually written imperatively, but it doesn’t have to be. The magical ingredient is anonymous functions, or in JavaScript parlance, “callbacks”.

JavaScript – Imperative

var xs = [1, 2, 3];
var ys = [];
for (var i = 0; i < xs.length; i++) {
   ys.push(xs[i] + 1);
}

JavaScript – Functional

var ys = [1, 2, 3].map(function (x) { return x + 1; }

Node.js is an experiment in server-side JavaScript. Why would someone use JavaScript for server programming? The languages brings asynchronous programming to the table: instead of wasting CPU cycles waiting for I/O, or RAM spawning threads, Node.js uses a simple event-driven model. No cycles are used until an event fires and a callback is executed.

Node.js doesn’t add functional ability to JavaScript, but it does provide a web server, documentation, and a common environment for coding. Like Common Lisp, JavaScript suffers from too many implementations  (V8, JägerMonkey, Chakra, SquirrelFish, Carakan, …) which burdens the programmer with handling low-level details.

To program functionally in Node.js, one should install the underscore library. A simple npm install underscore will do the trick.

Underscore contains many useful functions, including map, reduce, include, max, min, sortBy, first, rest, last, and zip. The last one, zip, is not used for compression but for interleaving sequences. Input: [1, 2, 3], ["Alice", "Bob", "Cathy"]. Output: [[1, "Alice"], [2, "Bob"], [3, "Cathy"]]. This is useful for reorganizing data structures, especially when you want to supply the elements as input to other functions. E.g., function printId(id, name) { console.log("ID: " + id + " Name: " + name); }

zip is handy, but after playing with Haskell for a month, I’ve come to depend on zipWith. This function zips the arguments and applies a function to them.

zipWith(
   function printId(id, name) {
      console.log("ID: " + id + " Name: " + name);
   },
   [1, 2, 3],
   ["Alice", "Bob", "Cathy"]
);

ID: 1 Name: Alice
ID: 2 Name: Bob
ID: 3 Name: Cathy

The same behavior in imperative program requires several more lines of non-reusable code. Underscore currently does not include the zipWith function, but we can easily define it ourselves.

var _ = require("underscore");
function zipWith(f) {
   var args = _.zip.apply(
      null,
      Array.prototype.slice.call(arguments, 1)
   );
   return args.map(function (arg) {
      return f.apply(null, arg);
   });
}

Huff, Huff.

erlang logoErlang has some really cool features, but it also has many detractors. Shebangs fuck it up. The semicolon is one of three finicky line endings. In a language designed to pass messages between processes, there is no POSIX signal capability. I’m tired of spending hours debugging other people’s code, but my code will never work until I document those bugs. If you’re a functional programmer considering Erlang, or an experience Erlang programmer wondering why Pythonistas and Rubyists don’t consider Erlang, read my bug report.

Cabal Woes

cabal logoProblem

Compilation errors? Installation issues? Upgrade issues? Uninstallation issues? Dependency issues? Broken packages? U JELLY? U PUDDI?

Solution

Uninstall Cabal packages

  1. rm -rf ~/.ghc
  2. rm -rf ~/.cabal

Reinstall Haskell Platform

  1. Download Haskell Platform.
  2. Install GHC and Haskell Platform.

Dependency chain companies will go bankrupt!

Broken package companies will go bankrupt!

Hackage companies will be forever in business!

Practical Common Lisp Rolling Review

book coverAs I follow the exercises in Practical Common Lisp (PCL) by Peter Seibel, I’ll post the experiences here. Let’s get started.

The Book Itself

is available in dead-tree format and as a series of HTML files. My new Nook is hungry for ePubs, so I downloaded each PCL chapter, dragged and dropped into Sigil, and tweaked the resulting ePub. Sigil has a horrible bug which shoves code blocks way down pages, so my copy of PCL isn’t looking too great. But I plan to write a CL program to restore its former glory.

Chapter 1

is an introduction designed to welcome programmers to Lisp. A short summary would be: DON’T PANIC. The intro is humorous and intriguing.

Chapter 2

Shit just got real. Now PCL ushers the reader to install Lisp in a box (really Lispbox). Back in 2005, Lispbox was surely a godsend for the budding Common Lisper, but in 2010 it’s not only aged, it’s broken. I’ve already blogged about the pain and agony of setting up a CL system, and I won’t repeat myself. Suffice to say MacPorts is only half a saving grace for Mac users. I’m currently using Aquamacs and Aquamacs SLIME with MacPorts CLISP and Quicklisp besides. My .emacs is pastebinned.

PCL covers the basics of Emacs shortcut keys then jumps to the Lisp interpreter. SLIME is quite helpful in this endeavor, compiling, running, debugging my code for me. I like the PCL’s Hello World is not a program but an expression entered into the REPL. Since half the work will be done there, it’s a good idea to get acquainted with ol’ Reppy.

PCL dives into writing defun’s; introduces the debugger; details the proper loading of source files and optional compilation of same, and ends with a NASA anecdote. A geek such as I couldn’t hope for much more than this chapter.

Chapter 3

The first practical chapter. PCL invites the reader to construct a simple music database, introducing Lisp syntacs and semantics along the way. I completed save-db, load-db, … up to update and delete-rows. I’m happy to report that Seibel’s code is clear and concise. PCL suggests using a combination of select and where to simulate a SQL query, with the effect that I almost believe I’m using a real database.

The section, “Removing Duplication and Winning Big” is not easy to skim over, so I’m stuck there for now. It covers Lisp macros, a topic I’ve only heard in reverent whispers.

I’ll let you in on a secret: I’m concurrently reading Real World Haskell. There are notable differences between Haskell and Common Lisp.

Haskell vs Common Lisp

Haskell has one official version (GHC) available on multiple platforms, free, and open source. Common Lisp has no official version. There are Windows-only versions, commercial versions, Linux-only versions with broken makefiles, and CL install kits that once served a purpose but now only serve to mock the senseless diversity of CL implementations.

Installing a complete Haskell environment is simple and painless. Installing Common Lisp took me on a journey.

GHC incorporates new features in each version. Common Lisp’s lack of an official version means that each flavor includes and omits important features. E.g., some come with multithreading, some don’t. ANSI can’t be as responsive as the Haskell community, so the Common Lisp standard will likely not benefit from new features anytime soon.

Haskell stores functions and variables in the same space. Common Lisp does not; it requires ' and #' to pass functions around.

Haskell syntax is light and uniform. By PCL Ch. 3, Common Lisp syntax is starting to slow me down.

Haskell has monads. Common Lisp has macros. More on that when I learn precisely what they are and how to use them.

Haskell’s error messages are somewhat esoteric. Common Lisp’s are more readable.

GHCi’s REPL has enough features to satisfy most programmers. Common Lisp REPLs often fail to include expression history, backspace functionality, and other obvious REPL features. CL REPLs are so crippled that Emacs/SLIME is a necessity.

Both Haskell and Common Lisp syntax are unique, requiring special effort to understand. I believe this is because BASIC syntax is more natural. STEP 1 STEP 2 STEP 3 is intuitive for humans. On the other hand, linear procedures aren’t very powerful. Functional languages have power because they break out of the STEP 1 STEP 2 STEP 3 routine: Spaghetti code is the result of a procedural programmer attempting to do more complex work.

Back to PCL

I’m not sure exactly when I need to #' functions in Common Lisp. While following the PCL exercises, I’ve left out #' for lambdas, and my code gives the same results as Seibel’s. It stands to reason that lambdas don’t need to be #'ed since they self-evaluate. Allegro and CLISP don’t need #' for lambdas, but I’m not sure about other CLs. In any case, I’d like to see a note about this in PCL. Also, I wonder why multiple quote syntaxes exist: ', `, and quote.

Common Lisp Syntax

Look at the code for generating a database query. There are SIX different symbols, each with their own purpose and usage.

(defmacro where (&rest clauses)
  `#'(lambda (cd)
    (and ,@(make-comparison-expr-list clauses))))

The ampersand in &rest is CL’s way of handling function arguments. &key would interpret the arguments as a property list, while &rest stores all the arguments in clauses.

The #' references the lambda without evaluating it. That is, pound (#) references an unevaluated (') lambda object.

The comma (,) and at (@) work together to force evaluation inside the unevaluated (`#') lambda expression. The lambda would have been prefaced with '#', but ,@ only works inside the backtick quote (`).

No poofters.