Learning Haskell the Hard Way

26 July 2014

When I was reading the collection of learning resources on Haskell and tried to find a good start, I quickly realized that none of the books or tutorials are suitable for me: the easier a tutorial claims to be, the harder to really understand Haskell by reading it. What I need is a terse documentation that introduces the syntax and semantics of Haskell systematically and clearly, but unfortunately none was found.

I know I have to try the hard way: reading the Haskell language specification directly and absorb it myself. To make the process less dull and record my progress, I will write down my learning notes here incrementally.

Overview

A Haskell program is organized with four levels: modules, declarations, expressions & lexical structures, but the specification is organized in the reverse order.

Haskell has ad hoc polymorphism (overloading) and parametric polymorphism (Hindley-Milner type structure).

Haskell has six namespaces:

for value
- variable
- constructor
for type entity
- type variable
- type constructor
- type class
module

The same name can be reused without conflicts as long as they are in different namespaces.

Lexical Structure

A Haskell program is composed of lexemes (tokens) and whitespaces.

program → { lexeme | whitespace }

Whitespace includes:

The complete set of whitespace in both ASCII & Unicode
Two kinds of comments
- inline comment starts with --
- nested comment wrapped by {- -}

A lexeme is one of the following:

identifier
- qvarid: (qualified) variable identifier
- qconid: (qualified) constructor identifier
- reservedid: reserved identifier
operator
- qvarsym: (qualified) variable (symbolic) operator
- qconsym: (qualified) constructor (symbolic) operator
- reservedop: reserved operator
literal: integer, float, character or string literal
special: one of special symbols ()[]{}`,;

A variable and a constructor is distinguished by the first character and put into different namespaces:

identifier
- variable: lower case (including underscore)
- constructor: upper case
operator
- variable: non-colon
- constructor: colon :

A variable or constructor can contain symbol ', so the common mathematical term “x prime” can be represented as x'.

By using layout rule (indentation), symbols {}; can be omitted in sereral grammer productions.

Expressions

Parentheses

(exp) is a parenthesized expression, and is equivalent to exp.

Function & operator application

Function is prefixed and curried, so f x y means (f x) y.

fexp    → [fexp] aexp

All operators are infixed except prefix negation -.

infixexp→ lexp qop infixexp
        | - infixexp    (prefix negation)
        | lexp
qop     → qvarop | qconop

An operator can be converted to prefix notation by parentheses. e.g.

(+) x y

Reversely, a function identifier (either variable or constructor) can be converted to an infix operator by backquotes.

x `op` y

List & Tuple

List is constructed with :.

1:2:3:[]   or   (:) 1 ((:) 2 ((:) 3 []))

Arithmetic sequence is another way to construct a list.

[1,3..6] == [1,3,5]
  [1..6] == [1,2,3,4,5,6]

Tuple is constructed with (,...,).

(1, 2, "a")   or   (,,) 1 2 "a"

Field label

A field label is used to give a name to a field in a datatype. It can be used to construct, extract and update the field.

A constructor with labeled fields:

aexp    → qcon { fbind1 , … , fbindn }  (n ≥ 0)
fbind   → qvar = exp

Pattern matching

A pattern itself is not an expression, but it is an important part of sereral expressions, including: lambda abstractions, function definitions, let expressions, list comprehensions, do expressions and case expressions.

Pattern matching is used to deconstruct values according to their type specification. It proceeds from left to right, and outside to inside. Attempting to match a pattern can have one of three results:

Fail
Succeed: returning a binding for each variable in the pattern
Diverge: i.e. return ⊥

The syntax for patterns covers a subset of expressions.

A pattern can match against infix expressions, but limited to infix constructors (the operator must be qconop).

pat     → lpat qconop pat (infix constructor)

A pattern can match against constructor functions (with or without field labels).

pat     → ...
        | lpat

lpat    → apat
        | gcon apat1 … apatk    (arity gcon  =  k, k ≥ 1) 

apat    → ...
        | gcon  (arity gcon  =  0)
        | qcon { fpat1 , … , fpatk }    (labeled pattern, k ≥ 0)

fpat    → qvar = pat

A pattern can match against a variable, a literal, a parenthesized expression, a tuple or a list.

lpat    → var ...
        | - (integer | float)   (negative literal) 

apat    → ...
        | literal
        | ( pat )               (parenthesized pattern)
        | ( pat1 , … , patk )   (tuple pattern, k ≥ 2)
        | [ pat1 , … , patk ]   (list pattern, k ≥ 1)

The variables defined within the pattern can be binded, but how to name and bind the whole pattern? This is exactly what an as pattern does (var before @ is the name for the whole pattern).

apat    → var [ @ apat] (as pattern)

Wildcard is used when you need a variable placeholder but do not want to bind the value to a name.

apat    → ...
        | _     (wildcard)

Sometimes you need a pattern that can never fail (only succeed or diverge), it is called a irrefutable pattern.

apat    → ...
        | ~ apat        (irrefutable pattern)

Besides ~apat, these patterns are also irrefutable: a variable, a wildcard, N apat where N is a constructor defined by newtype and apat is irrefutable, var@apat where apat is irrefutable. All other patterns are refutable.

Case expression

Case expression is very important because all other pattern matching expressions ultimately translate into case expressions.

A case expression has one or more alternatives.

lexp    → case exp of { alts }
alts    → alt1 ; … ; altn       (n ≥ 1)

An alternative is either a pattern or empty. The pattern either coresponds to an expression (body) directly, or has one or more guarded patterns (note an optional gdpat appears at the right side of itself). A guarded pattern starts with | and is composed of one or more guards.

alt     → pat -> exp [where decls]
        | pat  gdpat [where decls]
        | (empty alternative) 
gdpat   → guards -> exp [ gdpat ]
guards  → | guard1, …, guardn   (n ≥ 1)
decls   → { decl1 ; … ; decln } (n ≥ 0)

Note that each alternative can optionally have a where declaration. It is used to bind extra variables to be used in the local scope.

There are two types of guards: pattern guard & boolean guard, and local declarations can also be introduced together with guards.

guard   → pat <- infixexp     (pattern guard)
        | let decls           (local declaration)
        | infixexp            (boolean guard)

This is how a case expression works:

The expression after case is matched against each alternative till a match is found.
Then each guarded pattern in the matched alterantive is tested till one passes. A guarded pattern passes if and only if all of its guards pass.
If successful, the conresponding expression is returned, otherwise, the next guarded pattern or alternative is tried sequentially.
If no match can be found, the result is ⊥.

Lambda expression

Lambda is used to construct a function without a name. Besides a variable, the input can also be any pattern.

lexp    → \ apat1 … apatn -> exp        (n ≥ 1)

An example:

(\ (x, y) -> x+y) (3, 7)

Also note that lambda -> associates to the right, e.g.

Integer ->  Integer -> Integer
    is equivalent to
Integer -> (Integer -> Integer)

Let expression

A let expression introduces a nested, lexically-scoped, mutually-recursive list of declarations. exp after keyword in is the value of a let expression.

lexp    → let decls in exp
decls   → { decl1 ; … ; decln } (n ≥ 0)

Some examples:

let {} in 42
let {(x,y) = (1,2)} in x+y

List comprehension

A list comprehension constructs a list of elements represented by exp qualified by one or more qualifiers.

aexp    → [ exp | qual1 , … , qualn ]   (n ≥ 1)

There are three kinds of qualifiers.

A generator is composed of a pattern (pat with type t) and a list (e with type [t]). The pattern is matched against and binded to each element of the list one by one, so the binded variables can be used to generate each element of the result list. The generators are evaluated in a nested, depth-first order, and a failed match is just skipped over.

qual    → pat <- e      (generator)

A qualifier can also be a local declaration to bind auxiliary variables, or a boolean guard to exclude some elements from the list.

qual    → ...
        | let decls     (local declaration)
        | exp           (boolean guard)

Do expression

Wait till monad is fully understood.

Type signature

Haskell has type inference, but an expression can be optionally specified with a type signature.

exp     → infixexp :: [context =>] type

Declarations

There are top declarations (topdecl) that are only allowed at the top level of a module, and nested declarations (decl) that may be used either at the top level or in nested scopes.

topdecl → ...
        | ...
        | decl

Top declarations

A top declaration starts with one of the keywords: data, type, newtype, class, instance, default and foreign.

data, type and newtype is for declaring new types. class is for declaring typeclasses and instance is for declaring the membership between types and typeclasses.

They are explained later one by one.

Nested declarations

In the last section, decl appears in the let expression and where clause without any explanations. Actually, they are nested declarations.

There are five kinds of nested declarations: type signature, fixity, function binding, pattern binding and empty declartions.

A type signature simply specifies types for variables.

decl    → gendecl
        | ...
gendecl → vars :: [context =>] type
vars    → var1 , …, varn        (n ≥ 1)

A fixity declaration gives the fixity and binding precedence of one or more operators.

gendecl → ...
        | fixity [integer] ops
fixity  → infixl | infixr | infix
ops     → op1 , … , opn         (n ≥ 1)
op      → varop | conop

The right hand side (rhs) of function and pattern bindings are the same.

decl    → ...
        | (funlhs | pat) rhs

The syntax of rhs is almost the same as case expression, except -> is replaced by =.

rhs     → = exp [where decls]
        | gdrhs [where decls]
gdrhs   → guards = exp [gdrhs]

A function can be binded in multiple function binding declarations, as long as they are contiguous and the number of patterns is the same. In each of the declaration, the left hand side supports three alternative syntaxes.

funlhs  → var apat { apat }
        | pat varop pat
        | ( funlhs ) apat { apat }

And an example:

plus x y z = x+y+z
x ‘plus‘ y = \ z -> x+y+z
(x ‘plus‘ y) z = x+y+z

Type

Just as an expression may contain variables and denotes a value, a type expression may contain type variables and denotes a type value (but it is evaluated at compile time, unlike an expression evaluated at run time).

Type expressions are designed to have similar representations as their corresponding expressions.

Function type can be represented in infix or prefix notation. Like function application, type application (btype) is also prefixed and curried. atype is the type expression excluding infix function type and type application.

type    → btype [-> type]       (function type)
btype   → [btype] atype         (type application)
atype   → gtycon
        | ...
gtycon  → ...
        | (->)  (function constructor)

A type variable is an identifier beginning with a lowercase letter. A parenthesized type (t) is identical to the type t. A type constructor (type template) as an identifier begins with an uppercase letter. Besides function type, the syntaxes for tuple and list are also special syntaxes.

atype   → gtycon
        | tyvar
        | ( type1 , … , typek ) (tuple type, k ≥ 2)
        | [ type ]              (list type)
        | ( type )              (parenthesised constructor)
gtycon  → qtycon
        | ()            (unit type)
        | []            (list constructor)
        | (->)          (function constructor)
        | (,{,})        (tupling constructors)

Kind

What is *, * -> *…? An ordinary type has kind *. A type constructor (template) that has one type argument has kind * -> *, e.g. a list. So on and so forth.

Context

The context (context => type) is used to indicate that type tyvar belongs to class qtycls.

A context is composed of zero or more class assertions.

context → class
        | ( class1 , … , classn )               (n ≥ 0)

And a class assertion is either either a type variable, or the application of type variable to one or more types.

class   → qtycls tyvar
        | qtycls ( tyvar atype1 … atypen )      (n ≥ 1)
qtycls  → [ modid . ] tycls
tycls   → conid
tyvar   → varid

Algebraic data type

algebraic data type is named so because it is composed of other types by product and sum (algebraic operations). It introduces a new type constructor (simpletype) with zero or more constituent data constructors (constrs).

topdecl → ...
        | data [context =>] simpletype [= constrs] [deriving]

The type constructor begins with upper case letter and may have zero to more type variables as parameters. The type constructor then can be used in curried type application in a type expression.

simpletype      → tycon tyvar1 … tyvark         (k ≥ 0)

On the right side of =, sum (alternative) types are separated by |.

constrs → constr1 | … | constrn         (n ≥ 1)

For each constr, there are three alternative syntaxes.

A data constructor followed by zero or more atype.
An infix data constructor operator between two btype.
A data constructor followed by field declarations.

constr  → con [!] atype1 … [!] atypek   (arity con  =  k, k ≥ 0)
        | (btype | ! atype) conop (btype | ! atype) (infix conop)
        | con { fielddecl1 , … , fielddecln }       (n ≥ 0) 
con     → conid | ( consym )    (constructor)

! is strictness flag to indicate that the corresponding constructor argument is eagerly evaluated.

Type synonym

A type synonym is equivalent to its definition are completely interchangeable.

topdecl → type simpletype = type

Newtype

A newtype declaration introduces a distinct type whose representation is the same as an existing type.

topdecl     → newtype [context =>] simpletype = newconstr [deriving]
newconstr   → con atype
            | con { var :: type }

Unlike type synonyms, newtype may be used to define recursive types.
New instances can be defined for a type defined by newtype but may not be defined for a type synonym.
A type created by newtype has an extra level of indirection compared to an algebraic datatype, so the pattern matching rule is different.

Ad hoc polymorphism

Ad hoc polymorphism (overloading) is supported by typeclasses.

A class declaration introduces a new class and the operations (methods) on it.

Modules

To be continued…

About 301 Moved Permanently

25 July 2014

When building a website, there is one inevitable thing: 301 permanent redirection. The cases that have to involve 301 includes:

Redirection from www subdomain to naked domain or vise versa.
Redirection from slashed pretty URL to unslashed URL or vise versa.

301 is easy to implement with Go:

func redirect301(w http.ResponseWriter, url string) {
    w.Header().Set("Location", url)
    w.WriteHeader(http.StatusMovedPermanently)
}

There is one more thing that needs attention: The root path of a domain always contains a slash (GET / in HTTP request), regardless the user enters the slash or not, so the root path needs no redirection.

How to Achieve a Perfect PageSpeed Insights Score

24 July 2014

Indtroduction

PageSpeed Insights is an online tool by Google to measure the performance of an web page for mobile and desktop devices. It has a set of heuristic rules considering the network-independent aspects of page performance. Each rule has a weight and the total score ranges from 0 to 100 points. The desktop and mobile tests have the same set of rules for performance and mobile test has some extra rules about user experience.

Though it might be too rigorous to require ¹⁰⁰⁄₁₀₀ score for a web site, it is a good way to learn those rules to actually achieve it, and it does have a significant improvement to the load time of my site (load time reduces from more than 1 sec to about 500 ms in the Pingdom Website Speed Test). In the following section, I would like to share what I have learned by optimizing this website to achieve a perfect score, rule by rule.

Speed rules

Avoid landing page redirects

The redirection to mobile version can be avoided by responsive design (CSS media queries).

You cannot avoid redirections completely, e.g. I have a 301 redirection from www subdomain to the naked domain, but the naked domain does not have any further redirection.

Eliminate render-blocking JavaScript and CSS in above-the-fold content

This might be the hardest rule to fully comply with.

Firstly, you need to minimize and fully inline your CSS into HTML, and your CSS cannot be too large. (My CSS is 22kB after minimization). If you use web fonts, you should directly write the @font-face in your inlined CSS, rather than importing an external CSS.

Secondly, you need to inline or asynchronously load all your scripts. The “async” attribute is not very useful because it cannot control the order of the execution of multiple scripts. I have found HeadJS to solve the problem. HeadJS is inlined into the HTML (3.7kB), and other scripts are loaded asychronously but executed in order.

Finally, you have to make sure the delayed scripts will not affect the layout of above-the-fold content. You might have to make some trade off and cut off some functionality.

Enable compression, Leverage browser caching & Reduce server response time

Since I uses Google App Engine, the gzip compression should already been enabled by default, the expiration time for static resources are configurable via config.yaml and GAE responses fast enough.

Minify Resources (HTML, CSS, and JavaScript)

The CSS is minified by Compass, and scripts imported are already minified. The HTML file is not minified but it seems not serious enough to trigger a test warning by PageSpeed.

Optimize images

I use font icons in my site. When there has to be an image, use PNG compressed with OptiPNG.

Reduce the size of the above-the-fold content

It basically depends on your design and just remember the YAGNI principle.

User experience rules

These rule only applies to mobile platforms.

Avoid plugins

None of them on my site.

Configure the viewport

This is part of the responsive design.

Size content to viewport

An important lesson: never write a URL directly in an article because URL does not wrap well and could overflow out of the viewport width. instead, give the URL a name composed of words.

Size tap targets appropriately & Use legible font sizes

I have written some lists of links in my notes, which makes those links close to each other. To make them separate enough to pass this rule, the line height cannot be too small, here is the minimal requirement from my tests (font-size/line-height):

16px/28px
17px/28px
18px/29px
19px/30px
20px/30px

Conclusion

It is not rocket science to build a website with good performance. You just needs to tune and optimize every detail of your site patiently. In this process, PageSpeed Insights can help a lot.

An Overview of Continuous Integration Services

17 July 2014

After reviewing many continuous integration services below, there is no doubt that Wercker is the winner. The Wercker Box (virtual container) is based on Ubuntu and any packages can be easily installed with “apt-get”. This flexibility basically allows running any test suites and deploying to any cloud platforms. Furthermore, a Wercker Box can inherit from another box and can be stored and published in Wercker directories as the starting point for any projects, increasing the performance significantly.

Price

Open source friendly:

Has free plan:

No fee plan:

Installing Source Code Pro

17 July 2014

Source Code Pro is one of the best monospace fonts and it is open sourced.

Here is the script for installing it on Linux.

Installing Susy

17 July 2014

The script:

sudo apt-get install ruby
sudo apt-get install ruby-dev
sudo gem install susy -V
sudo gem install compass --pre -V
sudo gem lnstall breakpoint -V

Notes:

The current version of Susy uses a higher version of Sass than Compass, so the corresponding Compass version has to be a prerelease (–pre).
The installing process is slow and it feels better to print it out (-V).
You may have to run “compass watch” with sudo (issue 1497).
An alternative RubyGems source that is close to your location might help.

A List of URLs about Web Design

10 July 2014