The Promise of Statically Typed Functional Programming: Broader Context and Language Comparison

Konstantin Läufer (co-presenter)
George K. Thiruvathukal (co-presenter)
Joe Kaylor
Emerging Technologies Lab
Loyola University Chicago

Tuesday, 24 April 2012

Applied Mathematics Colloquium
Loyola University Chicago

Friday, 20 April 2012

Workshop on Functional Programming in Quantitative Finance
The Stevanovich Center for Financial Mathematics
The University of Chicago

Overview

Languages and paradigms

Computational models

Type systems

Specific functional languages

Conclusion

Early History of Programming Languages

Some Paradigm-Defining Programming Languages

How to choose from this Tower of Babel?

Dominant Programming Paradigms

So why are things so imperative?

Each deserves a look.

Turing Machine, 1936

Alan Turing’s “abstract machine” that predates modern programmable computers.

von Neumann, EDVAC Report, 1945

Von Neumann was influential in many endeavors (mathematics, physics, game theory, and computing) and was involved in the Manhattan project and the EDVAC computer (successor to the ENIAC of Eckhert and Mauchly).

Three ideas that have shaped modern computer engineering:

These views of computing are identical and were independently conceived by Babbage in the design of the Analytical Engine.

How to Choose?!?

Zoom Out: Fundamental Computational Models

These three models are equivalent in computational power.

Church-Turing thesis: these three models express anything that is algorithmically computable.

Lambda Calculus

<Exp> ::= <ident>
        | lambda <ident> . <Exp>  -- function abstraction
        | <Exp> <Exp>             -- function application
        | <constant>         -- needed "only in practice"

 (\f (x, y) -> (f x, f y)) (*2) (2, 3) -> (4, 6)

Sufficiently powerful to define everything within.

By encoding.

Yes, including recursion.

Make it Safe: Type Systems in Programming Languages

Type System Dimensions

Static Type Safety

Timeline of the Language/Type System Pendulum

ThoughtWorks Technology Radar: Languages Section

Languages

Languages

Main Statically Typed Contenders: Haskell, F#, Scala

All have powerful type systems, but Haskell the most.

All have support for concurrency.

Interesting Dynamically Typed Contenders

Running Example: Process Tree

Transform this

laufer   20736 25090  0 Apr16 pts/5    00:00:00 /bin/bash
...
laufer   21274 20736  0 Apr16 pts/5    00:00:01 emacs -nw test2bb.txt
laufer   21281 21274  0 Apr16 ?        00:00:00 /usr/bin/aspell -a -m -B --encoding=utf-8
...
laufer   25090     1  0 Apr15 ?        00:01:18 guake -OO /usr/lib/guake/guake.py
...
laufer   28849 25090  0 11:24 pts/9    00:00:00 /bin/bash

to this

 25090: guake -OO /usr/lib/guake/guake.py
  20736: /bin/bash
   21274: emacs -nw test2bb.txt
    21281: /usr/bin/aspell -a -m -B --encoding=utf-8
  28849: /bin/bash

Edge reversal!

In Java

private final Map<Integer, String> pMap = new HashMap<Integer, String>();
private final Map<Integer, List<Integer>> tMap = new HashMap<Integer, List<Integer>>();

    String line = null;
    while ((line = in.readLine()) != null) {
        final Process p = parser.parseString(line);
        pMap.put(p.pid, p.cmd);
        if (! tMap.containsKey(p.ppid))
            tMap.put(p.ppid, new ArrayList<Integer>(CHILD_LIST_SIZE));
        tMap.get(p.ppid).add(p.pid);                

// now print tMap as a tree

Languages with name-based typing require a name for every type! Verbose, not agile.

In a Typical Dynamic Language

pMap = new HashMap
tMap = new HashMap

    line = null;
    while ((line = in.readLine()) != null) {
        p = processParser.parseString(line);
        pMap.put(p.pid, p.cmd);
        if (! tMap.containsKey(p.ppid))
            tMap.put(p.ppid, new ArrayList(CHILD_LIST_SIZE));
        tMap.get(p.ppid).add(p.pid);                

// now print tMap as a tree

Fictitious language. (I kind of leapfrogged over dynamic languages.)

Type erasure of Java example.

In F# Using .NET Collections

let cMap = new Dictionary<int, String>()
let tMap = new Dictionary<int, ICollection<int>>()
let bOut = new StreamWriter(new BufferedStream(Console.OpenStandardOutput()))

let mutable line = Console.ReadLine()
while line <> null do
    let (words, pid, ppid) = // ...parse from line...
    cMap.[pid] <- line.Substring(iCmd)
    if not (tMap.ContainsKey(ppid)) then
        tMap.[ppid] <- new List<int>(8)
    tMap.[ppid].Add(pid)
    line <- Console.ReadLine()    
    
// now print tMap as a tree

Looks like a C# program.

Performs like a C# program!

In F# Using F#'s Built-In Collections

let rec lines = seq {
    let line = Console.ReadLine()
    if line <> null then yield line ; yield! lines
}

let parse (line: string) = // ...

let ps = lines |> Seq.map parse |> List.ofSeq
let cMap = ps |> List.map (fun (pid, _, cmd) -> (pid, cmd)) |> Map.ofList
let tMap = ps |> Seq.ofList |> Seq.groupBy (fun(_, ppid, _) -> ppid) |> Map.ofSeq

// now print tMap as a tree

Compose computations at the collection level.

The only mutable state is I/O.

About 50% performance overhead over previous version. In part because these maps are tree-based.

In Scala

val lines = scala.io.Source.fromInputStream(System.in).getLines
val out = new BufferedWriter(new OutputStreamWriter(System.out), IO_BUF_SIZE);
val parse = parseLine(lines.next())

val ps = lines map parse toList
val pmap = ps map { case (pid, _, cmd) => (pid, cmd) } toMap
val tmap = ps groupBy (_._2)

// now print tMap as a tree

Compose computations at the collection level.

The only mutable state is I/O.

In Haskell

printProcessTree :: [String] -> [String]
printProcessTree (header : lines) = 
  let 
    ps = map (parser header) lines
    pmap = IntMap.fromList ps
    tmap = IntMap.fromListWith (++) $ map (\(_, (pid, ppid, _)) -> (ppid, [pid])) ps
    showTrees l i = -- ...convert tmap to strings...
  in
    showTrees 0 0

main :: IO()
main = interact $ unlines . printProcessTree . lines

Compose computations at the collection level.

The only mutable state is I/O.

But even this is hidden in interact using dependency injection.

Monadic Style in Haskell

Different example: echo back input lines and print how many were there.

mainLoop :: Int -> IO ()
mainLoop count = count `seq` do
  eof <- isEOF
  if eof
    then
      putStrLn $ show count
    else do
      line <- getLine
      putStrLn line
      mainLoop $ count + 1

main :: IO ()
main = mainLoop 0

Looks like imperative code.

Mutable state of I/O is safely partitioned into I/O monad.

Higher-Kinded Typing in Haskell (and Scala)

Haskell has no subtype polymorphism, duck typing, or the like.

But...

fmap (*2) (Just 3) -> Just 6

let t = Node 1 [Node 2 [], Node 3 [Node 4 [], Node 5 []]]
fmap (*2) t -> Node 2 [Node 4 [], Node 6 [Node 8 [], Node 10 []]]

...it has typeclasses for systematic function overloading.

class Functor f where
  fmap :: Functor f => (a -> b) -> f a -> f b

Example of an abstraction over type constructors (generic types): f is Maybe a, Tree a, etc.

Scala: scalaz

F#: upstream in F* (experimental research language)

So How to Choose?

Basic Intrinsic Criteria

All three languages score very similarly on all of these!

Matt's Extrinsic Criteria from Consulting Firm

Other Interesting Criteria

Popularity of Languages

Top languages on GitHub

Top languages on GitHub

Tiobe index at www.tiobe.com

Performance

Both using Mono on Linux.

Assessment of Main Contenders: Haskell

Assessment of Main Contenders: Scala

Assessment of Main Contenders: F#

Conclusion

Questions, Comments, Discussion...