Haskell 98 Language and Libraries

The Revised Report

Simon Peyton Jones (editor)

Copyright notice.

The authors and publisher intend this Report to belong to the entire Haskell community, and grant permission to copy and distribute it for any purpose, provided that it is reproduced in its entirety, including this Notice. Modified versions of this Report may also be copied and distributed for any purpose, provided that the modified version is clearly presented as such, and that it does not claim to be a definition of the language Haskell 98.


Contents

I The Haskell 98 Language

1 Introduction
1.1 Program Structure
1.2 The Haskell Kernel
1.3 Values and Types
1.4 Namespaces

2 Lexical Structure
2.1 Notational Conventions
2.2 Lexical Program Structure
2.3 Comments
2.4 Identifiers and Operators
2.5 Numeric Literals
2.6 Character and String Literals
2.7 Layout

3 Expressions
3.1 Errors
3.2 Variables, Constructors, Operators, and Literals
3.3 Curried Applications and Lambda Abstractions
3.4 Operator Applications
3.5 Sections
3.6 Conditionals
3.7 Lists
3.8 Tuples
3.9 Unit Expressions and Parenthesized Expressions
3.10 Arithmetic Sequences
3.11 List Comprehensions
3.12 Let Expressions
3.13 Case Expressions
3.14 Do Expressions
3.15 Datatypes with Field Labels
3.15.1 Field Selection
3.15.2 Construction Using Field Labels
3.15.3 Updates Using Field Labels
3.16 Expression Type-Signatures
3.17 Pattern Matching
3.17.1 Patterns
3.17.2 Informal Semantics of Pattern Matching
3.17.3 Formal Semantics of Pattern Matching

4 Declarations and Bindings
4.1 Overview of Types and Classes
4.1.1 Kinds
4.1.2 Syntax of Types
4.1.3 Syntax of Class Assertions and Contexts
4.1.4 Semantics of Types and Classes
4.2 User-Defined Datatypes
4.2.1 Algebraic Datatype Declarations
4.2.2 Type Synonym Declarations
4.2.3 Datatype Renamings
4.3 Type Classes and Overloading
4.3.1 Class Declarations
4.3.2 Instance Declarations
4.3.3 Derived Instances
4.3.4 Ambiguous Types, and Defaults for Overloaded Numeric Operations
4.4 Nested Declarations
4.4.1 Type Signatures
4.4.2 Fixity Declarations
4.4.3 Function and Pattern Bindings
4.4.3.1 Function bindings
4.4.3.2 Pattern bindings
4.5 Static Semantics of Function and Pattern Bindings
4.5.1 Dependency Analysis
4.5.2 Generalization
4.5.3 Context Reduction Errors
4.5.4 Monomorphism
4.5.5 The Monomorphism Restriction
4.6 Kind Inference

5 Modules
5.1 Module Structure
5.2 Export Lists
5.3 Import Declarations
5.3.1 What is imported
5.3.2 Qualified import
5.3.3 Local aliases
5.3.4 Examples
5.4 Importing and Exporting Instance Declarations
5.5 Name Clashes and Closure
5.5.1 Qualified names
5.5.2 Name clashes
5.5.3 Closure
5.6 Standard Prelude
5.6.1 The Prelude Module
5.6.2 Shadowing Prelude Names
5.7 Separate Compilation
5.8 Abstract Datatypes

6 Predefined Types and Classes
6.1 Standard Haskell Types
6.1.1 Booleans
6.1.2 Characters and Strings
6.1.3 Lists
6.1.4 Tuples
6.1.5 The Unit Datatype
6.1.6 Function Types
6.1.7 The IO and IOError Types
6.1.8 Other Types
6.2 Strict Evaluation
6.3 Standard Haskell Classes
6.3.1 The Eq Class
6.3.2 The Ord Class
6.3.3 The Read and Show Classes
6.3.4 The Enum Class
6.3.5 The Functor Class
6.3.6 The Monad Class
6.3.7 The Bounded Class
6.4 Numbers
6.4.1 Numeric Literals
6.4.2 Arithmetic and Number-Theoretic Operations
6.4.3 Exponentiation and Logarithms
6.4.4 Magnitude and Sign
6.4.5 Trigonometric Functions
6.4.6 Coercions and Component Extraction

7 Basic Input/Output
7.1 Standard I/O Functions
7.2 Sequencing I/O Operations
7.3 Exception Handling in the I/O Monad

8 Standard Prelude
8.1 Prelude PreludeList
8.2 Prelude PreludeText
8.3 Prelude PreludeIO

9 Syntax Reference
9.1 Notational Conventions
9.2 Lexical Syntax
9.3 Layout
9.4 Literate comments
9.5 Context-Free Syntax

10 Specification of Derived Instances
10.1 Derived instances of Eq and Ord
10.2 Derived instances of Enum
10.3 Derived instances of Bounded
10.4 Derived instances of Read and Show
10.5 An Example

11 Compiler Pragmas
11.1 Inlining
11.2 Specialization

II The Haskell 98 Libraries

12 Rational Numbers
12.1 Library Ratio

13 Complex Numbers
13.1 Library Complex

14 Numeric
14.1 Showing functions
14.2 Reading functions
14.3 Miscellaneous
14.4 Library Numeric

15 Indexing Operations
15.1 Deriving Instances of Ix
15.2 Library Ix

16 Arrays
16.1 Array Construction
16.1.1 Accumulated Arrays
16.2 Incremental Array Updates
16.3 Derived Arrays
16.4 Library Array

17 List Utilities
17.1 Indexing lists
17.2 “Set” operations
17.3 List transformations
17.4 unfoldr
17.5 Predicates
17.6 The “By” operations
17.7 The “generic” operations
17.8 Further “zip” operations
17.9 Library List

18 Maybe Utilities
18.1 Library Maybe

19 Character Utilities
19.1 Library Char

20 Monad Utilities
20.1 Naming conventions
20.2 Class MonadPlus
20.3 Functions
20.4 Library Monad

21 Input/Output
21.1 I/O Errors
21.2 Files and Handles
21.2.1 Standard Handles
21.2.2 Semi-Closed Handles
21.2.3 File locking
21.3 Opening and Closing Files
21.3.1 Opening Files
21.3.2 Closing Files
21.4 Determining the Size of a File
21.5 Detecting the End of Input
21.6 Buffering Operations
21.6.1 Flushing Buffers
21.7 Repositioning Handles
21.7.1 Revisiting an I/O Position
21.7.2 Seeking to a new Position
21.8 Handle Properties
21.9 Text Input and Output
21.9.1 Checking for Input
21.9.2 Reading Input
21.9.3 Reading Ahead
21.9.4 Reading The Entire Input
21.9.5 Text Output
21.10 Examples
21.10.1 Summing Two Numbers
21.10.2 Copying Files
21.11 Library IO

22 Directory Functions

23 System Functions

24 Dates and Times
24.1 Library Time

25 Locale
25.1 Library Locale

26 CPU Time

27 Random Numbers
27.1 The RandomGen class, and the StdGen generator
27.2 The Random class
27.3 The global random number generator

References

Index


Preface

“Some half dozen persons have written technically on combinatory logic, and most of these, including ourselves, have published something erroneous. Since some of our fellow sinners are among the most careful and competent logicians on the contemporary scene, we regard this as evidence that the subject is refractory. Thus fullness of exposition is necessary for accuracy; and excessive condensation would be false economy here, even more than it is ordinarily.”

Haskell B. Curry and Robert Feys in the Preface to Combinatory Logic [2], May 31, 1956

In September of 1987 a meeting was held at the conference on Functional Programming Languages and Computer Architecture (FPCA ’87) in Portland, Oregon, to discuss an unfortunate situation in the functional programming community: there had come into being more than a dozen non-strict, purely functional programming languages, all similar in expressive power and semantic underpinnings. There was a strong consensus at this meeting that more widespread use of this class of functional languages was being hampered by the lack of a common language. It was decided that a committee should be formed to design such a language, providing faster communication of new ideas, a stable foundation for real applications development, and a vehicle through which others would be encouraged to use functional languages. This document describes the result of that committee’s efforts: a purely functional programming language called Haskell, named after the logician Haskell B. Curry whose work provides the logical basis for much of ours.

Goals

The committee’s primary goal was to design a language that satisfied these constraints:

1. It should be suitable for teaching, research, and applications, including building large systems.

2. It should be completely described via the publication of a formal syntax and semantics.

3. It should be freely available. Anyone should be permitted to implement the language and distribute it to whomever they please.

4. It should be based on ideas that enjoy a wide consensus.

5. It should reduce unnecessary diversity in functional programming languages.

The committee intended that Haskell would serve as a basis for future research in language design, and hoped that extensions or variants of the language would appear, incorporating experimental features.

Haskell 98: language and libraries

Haskell has indeed evolved continuously since its original publication. By the middle of 1997, there had been four iterations of the language design (the latest at that point being Haskell 1.4). At the 1997 Haskell Workshop in Amsterdam, it was decided that a stable variant of Haskell was needed; this stable language is the subject of this Report, and is called “Haskell 98”.

Haskell 98 was conceived as a relatively minor tidy-up of Haskell 1.4, making some simplifications, and removing some pitfalls for the unwary. It is intended to be a “stable” language in the sense that the implementors are committed to supporting Haskell 98 exactly as specified, for the foreseeable future.

The original Haskell Report covered only the language, together with a standard library called the Prelude. By the time Haskell 98 was stabilised, it had become clear that many programs need access to a larger set of library functions (notably concerning input/output and simple interaction with the operating system). If these programs were to be portable, a set of libraries would have to be standardised too. A separate effort was therefore begun by a distinct (but overlapping) committee to fix the Haskell 98 Libraries.

The Haskell 98 Language and Library Reports were published in February 1999.

Revising the Haskell 98 Reports

After a year or two, many typographical errors and infelicities had been spotted. I took on the role of gathering and acting on these corrections, with the following goals:

Correct typographical errors.

Clarify obscure passages.

Resolve ambiguities.

With reluctance, make small changes to make the overall language more consistent.

This task turned out to be much, much larger than I had anticipated. As Haskell becomes more widely used, the Report has been scrutinised by more and more people, and I have adopted hundreds of (mostly small) changes as a result of their feedback. The original committees ceased to exist when the original Haskell 98 Reports were published, so every change was instead proposed to the entire Haskell mailing list.

This document is the outcome of this process of refinement. It includes both the Haskell 98 Language Report and the Libraries Report, and constitutes the official specification of both. It is not a tutorial on programming in Haskell such as the ‘Gentle Introduction’ [6], and some familiarity with functional languages is assumed.

The entire text of both Reports is available online (see “Haskell resources” below).

Extensions to Haskell 98

Haskell continues to evolve, going well beyond Haskell 98. For example, at the time of writing there are Haskell implementations that support:

Syntactic sugar, including:

pattern guards;

recursive do-notation;

lexically scoped type variables;

meta-programming facilities;

Type system innovations, including:

multi-parameter type classes;

functional dependencies;

existential types;

local universal polymorphism and arbitrary rank-types;

Control extensions, including:

monadic state;

exceptions;

concurrency;

There is more besides. Haskell 98 does not impede these developments. Instead, it provides a stable point of reference, so that those who wish to write text books, or use Haskell for teaching, can do so in the knowledge that Haskell 98 will continue to exist.

Haskell Resources

The Haskell web site http://haskell.org gives access to many useful resources, including:

Online versions of the language and library definitions, including a complete list of all the differences between Haskell 98 as published in February 1999 and this revised version.

Tutorial material on Haskell.

Details of the Haskell mailing list.

Implementations of Haskell.

Contributed Haskell tools and libraries.

Applications of Haskell.

You are welcome to comment on, suggest improvements to, and criticise the language or its presentation in the report, via the Haskell mailing list.

Building the language

Haskell was created, and continues to be sustained, by an active community of researchers and application programmers. Those who served on the Language and Library committees, in particular, devoted a huge amount of time and energy to the language. Here they are, with their affiliation(s) for the relevant period:

Arvind (MIT)
Lennart Augustsson (Chalmers University)
Dave Barton (Mitre Corp)
Brian Boutel (Victoria University of Wellington)
Warren Burton (Simon Fraser University)
Jon Fairbairn (University of Cambridge)
Joseph Fasel (Los Alamos National Laboratory)
Andy Gordon (University of Cambridge)
Maria Guzman (Yale University)
Kevin Hammond (University of Glasgow)
Ralf Hinze (University of Bonn)
Paul Hudak [editor] (Yale University)
John Hughes [editor] (University of Glasgow; Chalmers University)
Thomas Johnsson (Chalmers University)
Mark Jones (Yale University, University of Nottingham, Oregon Graduate Institute)
Dick Kieburtz (Oregon Graduate Institute)
John Launchbury (University of Glasgow; Oregon Graduate Institute)
Erik Meijer (Utrecht University)
Rishiyur Nikhil (MIT)
John Peterson (Yale University)
Simon Peyton Jones [editor] (University of Glasgow; Microsoft Research Ltd)
Mike Reeve (Imperial College)
Alastair Reid (University of Glasgow)
Colin Runciman (University of York)
Philip Wadler [editor] (University of Glasgow)
David Wise (Indiana University)
Jonathan Young (Yale University)

Those marked [editor] served as the co-ordinating editor for one or more revisions of the language.

In addition, dozens of other people made helpful contributions, some small but many substantial. They are as follows: Kris Aerts, Hans Aberg, Sten Anderson, Richard Bird, Stephen Blott, Tom Blenko, Duke Briscoe, Paul Callaghan, Magnus Carlsson, Mark Carroll, Manuel Chakravarty, Franklin Chen, Olaf Chitil, Chris Clack, Guy Cousineau, Tony Davie, Craig Dickson, Chris Dornan, Laura Dutton, Chris Fasel, Pat Fasel, Sigbjorn Finne, Michael Fryers, Andy Gill, Mike Gunter, Cordy Hall, Mark Hall, Thomas Hallgren, Matt Harden, Klemens Hemm, Fergus Henderson, Dean Herington, Ralf Hinze, Bob Hiromoto, Nic Holt, Ian Holyer, Randy Hudson, Alexander Jacobson, Patrik Jansson, Robert Jeschofnik, Orjan Johansen, Simon B. Jones, Stef Joosten, Mike Joy, Stefan Kahrs, Antti-Juhani Kaijanaho, Jerzy Karczmarczuk, Wolfram Kahl, Kent Karlsson, Richard Kelsey, Siau-Cheng Khoo, Amir Kishon, Feliks Kluzniak, Jan Kort, Marcin Kowalczyk, Jose Labra, Jeff Lewis, Mark Lillibridge, Bjorn Lisper, Sandra Loosemore, Pablo Lopez, Olaf Lubeck, Ian Lynagh, Christian Maeder, Ketil Malde, Simon Marlow, Michael Marte, Jim Mattson, John Meacham, Sergey Mechveliani, Gary Memovich, Randy Michelsen, Rick Mohr, Andy Moran, Graeme Moss, Henrik Nilsson, Arthur Norman, Nick North, Chris Okasaki, Bjarte M. Østvold, Paul Otto, Sven Panne, Dave Parrott, Ross Paterson, Larne Pekowsky, Rinus Plasmeijer, Ian Poole, Stephen Price, John Robson, Andreas Rossberg, George Russell, Patrick Sansom, Michael Schneider, Felix Schroeter, Julian Seward, Nimish Shah, Christian Sievers, Libor Skarvada, Jan Skibinski, Lauren Smith, Raman Sundaresh, Josef Svenningsson, Ken Takusagawa, Satish Thatte, Simon Thompson, Tom Thomson, Tommy Thorn, Dylan Thurston, Mike Thyer, Mark Tullsen, David Tweed, Pradeep Varma, Malcolm Wallace, Keith Wansbrough, Tony Warnock, Michael Webber, Carl Witty, Stuart Wray, and Bonnie Yantis.

Finally, aside from the important foundational work laid by Church, Rosser, Curry, and others on the lambda calculus, it is right to acknowledge the influence of many noteworthy programming languages developed over the years. Although it is difficult to pinpoint the origin of many ideas, the following languages were particularly influential: Lisp (and its modern-day incarnations Common Lisp and Scheme); Landin’s ISWIM; APL; Backus’s FP [1]; ML and Standard ML; Hope and Hope+; Clean; Id; Gofer; Sisal; and Turner’s series of languages culminating in Miranda¹. Without these forerunners Haskell would not have been possible.

¹ Miranda is a trademark of Research Software Ltd.

Simon Peyton Jones

Cambridge, September 2002

Part I

The Haskell 98 Language


Chapter 1

Introduction

Haskell is a general purpose, purely functional programming language incorporating many recent innovations in programming language design. Haskell provides higher-order functions, non-strict semantics, static polymorphic typing, user-defined algebraic datatypes, pattern-matching, list comprehensions, a module system, a monadic I/O system, and a rich set of primitive datatypes, including lists, arrays, arbitrary and fixed precision integers, and floating-point numbers. Haskell is both the culmination and solidification of many years of research on non-strict functional languages.

This report defines the syntax for Haskell programs and an informal abstract semantics for the meaning of such programs. We leave as implementation dependent the ways in which Haskell programs are to be manipulated, interpreted, compiled, etc. This includes such issues as the nature of programming environments and the error messages returned for undefined programs (i.e. programs that formally evaluate to ⊥).

1.1 Program Structure

In this section, we describe the abstract syntactic and semantic structure of Haskell, as well as how it relates to the organization of the rest of the report.

1. At the topmost level a Haskell program is a set of modules, described in Chapter 5. Modules provide a way to control namespaces and to re-use software in large programs.

2. The top level of a module consists of a collection of declarations, of which there are several kinds, all described in Chapter 4. Declarations define things such as ordinary values, datatypes, type classes, and fixity information.

3. At the next lower level are expressions, described in Chapter 3. An expression denotes a value and has a static type; expressions are at the heart of Haskell programming “in the small.”

4. At the bottom level is Haskell’s lexical structure, defined in Chapter 2. The lexical structure captures the concrete representation of Haskell programs in text files.
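As a rough illustration of the levels listed above, the following sketch shows one module whose top level is a collection of declarations, each of which is built from expressions. The names Shape, area and totalArea are invented for this example and do not appear in the Report.

```haskell
module Main where   -- a module (Chapter 5); with no export list it exports everything

-- Declarations (Chapter 4): a datatype, type signatures and function bindings.
data Shape = Circle Double | Rect Double Double

area :: Shape -> Double
area (Circle r) = pi * r * r     -- each right-hand side is an expression (Chapter 3)
area (Rect w h) = w * h

totalArea :: [Shape] -> Double
totalArea shapes = sum (map area shapes)
```

Loading the file into an interpreter and evaluating totalArea [Rect 2 3, Rect 1 1] is one way to try it out.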

This report proceeds bottom-up with respect to Haskell’s syntactic structure.

The chapters not mentioned above are Chapter 6, which describes the standard built-in datatypes and classes in Haskell, and Chapter 7, which discusses the I/O facility in Haskell (i.e. how Haskell programs communicate with the outside world). Also, there are several chapters describing the Prelude, the concrete syntax, literate programming, the specification of derived instances, and pragmas supported by most Haskell compilers.

Examples of Haskell program fragments in running text are given in typewriter font:

    let x = 1
        z = x+y
    in  z+1

“Holes” in program fragments representing arbitrary pieces of Haskell code are written in italics, as in if e1 then e2 else e3. Generally the italicized names are mnemonic, such as e for expressions, d for declarations, t for types, etc.

1.2 The Haskell Kernel

Haskell has adopted many of the convenient syntactic structures that have become popular in functional programming. In this Report, the meaning of such syntactic sugar is given by translation into simpler constructs. If these translations are applied exhaustively, the result is a program written in a small subset of Haskell that we call the Haskell kernel.

Although the kernel is not formally specified, it is essentially a slightly sugared variant of the lambda calculus with a straightforward denotational semantics. The translation of each syntactic structure into the kernel is given as the syntax is introduced. This modular design facilitates reasoning about Haskell programs and provides useful guidelines for implementors of the language.
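To make the idea of translation into simpler constructs concrete, here is a small sketch that pairs three sugared forms with hand-written equivalents, following the identities for conditionals (Section 3.6), lists (Section 3.7) and sections (Section 3.5). All of the definition names are hypothetical, and the right-hand sides are manual translations rather than the output of any tool.

```haskell
-- A conditional and its translation into a case expression.
sugared :: Int -> String
sugared n = if n > 0 then "positive" else "non-positive"

desugared :: Int -> String
desugared n = case n > 0 of
  True  -> "positive"
  False -> "non-positive"

-- List syntax is shorthand for repeated use of the constructor (:).
sugaredList, desugaredList :: [Int]
sugaredList   = [1, 2, 3]
desugaredList = 1 : (2 : (3 : []))

-- A right section is shorthand for a lambda abstraction.
increment, increment' :: Int -> Int
increment  = (+ 1)
increment' = \x -> x + 1
```

Each pair denotes the same value, which can be checked by comparing the two forms on sample arguments.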

1.3 Values and Types

An expression evaluates to a value and has a static type. Values and types are not mixed in Haskell. However, the type system allows user-defined datatypes of various sorts, and permits not only parametric polymorphism (using a traditional Hindley-Milner type structure) but also ad hoc polymorphism, or overloading (using type classes).

Errors in Haskell are semantically equivalent to ⊥. Technically, they are not distinguishable from nontermination, so the language includes no mechanism for detecting or acting upon errors. However, implementations will probably try to provide useful information about errors. See Section 3.1.
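The two kinds of polymorphism mentioned in this section can be contrasted with a short sketch. The function pairUp and the class Describe are invented for illustration; they are not part of the Prelude or of the Report.

```haskell
-- Parametric polymorphism: a single definition that works uniformly at every type.
pairUp :: a -> (a, a)
pairUp x = (x, x)

-- Ad hoc polymorphism (overloading): the behaviour is chosen per type through a class.
class Describe a where
  describe :: a -> String

instance Describe Bool where
  describe True  = "yes"
  describe False = "no"

instance Describe Int where
  describe n = "the integer " ++ show n
```

Here pairUp never inspects its argument, whereas describe has a separate definition for each instance type.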


1.4 Namespaces

There are six kinds of names in Haskell: those for variables and constructors denote values; those for type variables, type constructors, and type classes refer to entities related to the type system; and module names refer to modules. There are two constraints on naming:

1. Names for variables and type variables are identifiers beginning with lowercase letters or underscore; the other four kinds of names are identifiers beginning with uppercase letters.

2. An identifier must not be used as the name of a type constructor and a class in the same scope.

These are the only constraints; for example, Int may simultaneously be the name of a module, class, and constructor within a single scope.
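A minimal sketch of how the separate namespaces allow one name to be reused follows. The type Pair and the functions swap and first are hypothetical names chosen for this example.

```haskell
-- 'Pair' names both a type constructor and a data constructor.
-- The two live in different namespaces, so there is no clash.
data Pair a = Pair a a

-- 'a' is a type variable in the signature and an ordinary variable in the
-- pattern and body; again the namespaces are distinct.
swap :: Pair a -> Pair a
swap (Pair a b) = Pair b a

first :: Pair a -> a
first (Pair x _) = x
```

Declaring a class named Pair in the same scope would be rejected, because that would violate the second constraint above.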


Chapter 2

Lexical Structure

In this chapter, we describe the low-level lexical structure of Haskell. Most of the details may be skipped in a first reading of the report.

2.1 Notational Conventions

These notational conventions are used for presenting syntax:

    [pattern]       optional
    {pattern}       zero or more repetitions
    (pattern)       grouping
    pat1 | pat2     choice
    pat⟨pat′⟩       difference: elements generated by pat except those generated by pat′
    fibonacci       terminal syntax in typewriter font

Because the syntax in this section describes lexical syntax, all whitespace is expressed explicitly; there is no implicit space between juxtaposed symbols. BNF-like syntax is used throughout, with productions having the form:

    nonterm → alt1 | alt2 | … | altn

Care must be taken in distinguishing metalogical syntax such as | and […] from concrete terminal syntax (given in typewriter font) such as | and [...], although usually the context makes the distinction clear.

Haskell uses the Unicode [11] character set. However, source programs are currently biased toward the ASCII character set used in earlier versions of Haskell.

This syntax depends on properties of the Unicode characters as defined by the Unicode consortium.

Haskell compilers are expected to make use of new versions of Unicode as they are made available.


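2.2 Lexical Program Structure

    program     → { lexeme | whitespace }
    lexeme      → qvarid | qconid | qvarsym | qconsym
                | literal | special | reservedop | reservedid
    literal     → integer | float | char | string
    special     → ( | ) | , | ; | [ | ] | ` | { | }

    whitespace  → whitestuff {whitestuff}
    whitestuff  → whitechar | comment | ncomment
    whitechar   → newline | vertab | space | tab | uniWhite
    newline     → return linefeed | return | linefeed | formfeed
    return      → a carriage return
    linefeed    → a line feed
    vertab      → a vertical tab
    formfeed    → a form feed
    space       → a space
    tab         → a horizontal tab
    uniWhite    → any Unicode character defined as whitespace

    comment     → dashes [ any⟨symbol⟩ {any} ] newline
    dashes      → -- {-}
    opencom     → {-
    closecom    → -}
    ncomment    → opencom ANYseq {ncomment ANYseq} closecom
    ANYseq      → {ANY}⟨{ANY} ( opencom | closecom ) {ANY}⟩
    ANY         → graphic | whitechar
    any         → graphic | space | tab
    graphic     → small | large | symbol | digit | special | : | " | '

    small       → ascSmall | uniSmall | _
    ascSmall    → a | b | … | z
    uniSmall    → any Unicode lowercase letter

    large       → ascLarge | uniLarge
    ascLarge    → A | B | … | Z
    uniLarge    → any uppercase or titlecase Unicode letter
    symbol      → ascSymbol | uniSymbol⟨special | _ | : | " | '⟩

    ascSymbol   → ! | # | $ | % | & | * | + | . | / | < | = | > | ? | @
                | \ | ^ | | | - | ~
    uniSymbol   → any Unicode symbol or punctuation
    digit       → ascDigit | uniDigit
    ascDigit    → 0 | 1 | … | 9
    uniDigit    → any Unicode decimal digit
    octit       → 0 | 1 | … | 7
    hexit       → digit | A | … | F | a | … | f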

Lexical analysis should use the “maximal munch” rule: at each point, the longest possible lexeme satisfying the lexeme production is read. So, although case is a reserved word, cases is not. Similarly, although = is reserved, == and ~= are not.
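The maximal munch rule can be seen at work in a few definitions. In this sketch the identifier cases and the operator ~= are ordinary user-defined names; the approximate-equality meaning given to ~= and its tolerance are assumptions made purely for illustration.

```haskell
-- 'cases' is an ordinary identifier even though 'case' is reserved:
-- the longest possible lexeme is read, so the keyword is never seen.
cases :: Int
cases = 3

-- '=' and '~' are reserved, but '~=' is a single, ordinary operator symbol.
(~=) :: Double -> Double -> Bool
x ~= y = abs (x - y) < 1e-9

-- '==' is likewise one lexeme, not two '=' tokens.
check :: Bool
check = cases == 3 && (0.1 + 0.2) ~= 0.3
```

Because ~= has no fixity declaration it takes the default fixity, so it binds more tightly than both == and && in the last line.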

Any kind of whitespace is also a proper delimiter for lexemes.

Characters not in the category ANY are not valid in Haskell programs and should result in a lexing error.

2.3 Comments

Comments are valid whitespace.

An ordinary comment begins with a sequence of two or more consecutive dashes (e.g. --) and extends to the following newline. The sequence of dashes must not form part of a legal lexeme. For example, “-->” or “|--” do not begin a comment, because both of these are legal lexemes; however “--foo” does start a comment.

A nested comment begins with “{-” and ends with “-}”. No legal lexeme starts with “{-”; hence, for example, “{---” starts a nested comment despite the trailing dashes.

The comment itself is not lexically analysed. Instead, the first unmatched occurrence of the string “-}” terminates the nested comment. Nested comments may be nested to any depth: any occurrence of the string “{-” within the nested comment starts a new nested comment, terminated by “-}”. Within a nested comment, each “{-” is matched by a corresponding occurrence of “-}”.

In an ordinary comment, the character sequences “{-” and “-}” have no special significance, and, in a nested comment, a sequence of dashes has no special significance.

Nested comments are also used for compiler pragmas, as explained in Chapter 11.

If some code is commented out using a nested comment, then any occurrence of {- or -} within a string or within an end-of-line comment in that code will interfere with the nested comments.
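The comment rules in this section can be exercised in a small source file. In the sketch below, the operator --> (given the meaning of logical implication) and the name answer are invented for the example.

```haskell
-- An ordinary comment: two or more dashes, up to the end of the line.
------ A longer run of dashes starts a comment too.

{- A nested comment
   {- may contain further nested comments, -}
   and a run of dashes -- has no special meaning inside it. -}

-- "-->" is a legal operator lexeme, so it does not begin a comment.
(-->) :: Bool -> Bool -> Bool
p --> q = not p || q

answer :: Int
answer = 40 {- a nested comment between tokens -} + 2  -- trailing comment
```

Note that the inner nested comment must be closed before the outer one can end, as required by the matching rule described above.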

2.4 Identifiers and Operators

    varid       → (small {small | large | digit | ' })⟨reservedid⟩
    conid       → large {small | large | digit | ' }
    reservedid  → case | class | data | default | deriving | do | else
                | if | import | in | infix | infixl | infixr | instance
                | let | module | newtype | of | then | type | where | _

An identifier consists of a letter followed by zero or more letters, digits, underscores, and single quotes. Identifiers are lexically distinguished into two namespaces (Section 1.4): those that begin with a lower-case letter (variable identifiers) and those that begin with an upper-case letter (constructor identifiers). Identifiers are case sensitive: name, naMe, and Name are three distinct identifiers (the first two are variable identifiers, the last is a constructor identifier).

Underscore, “_”, is treated as a lower-case letter, and can occur wherever a lower-case letter can.

However, “_” all by itself is a reserved identifier, used as wild card in patterns. Compilers that offer warnings for unused identifiers are encouraged to suppress such warnings for identifiers beginning with underscore. This allows programmers to use “_foo” for a parameter that they expect to be unused.
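The identifier rules above are illustrated by the following sketch. Every name in it (Person, Name, name, naMe, x', constant, isEmpty) is hypothetical and chosen only to show which spellings are legal.

```haskell
-- A constructor identifier begins with an upper-case letter.
data Person = Name String

-- Identifiers are case sensitive: these are two distinct variable identifiers.
name, naMe :: String
name = "lower"
naMe = "mixed"

-- Single quotes and digits may follow the first character.
x', x'' :: Int
x'  = 1
x'' = x' + 1

-- A leading underscore marks a parameter that is expected to be unused.
constant :: a -> b -> a
constant v _unused = v

-- A lone underscore is the wildcard pattern.
isEmpty :: [a] -> Bool
isEmpty [] = True
isEmpty _  = False
```

A compiler that warns about unused identifiers would typically stay silent about _unused, in line with the recommendation above.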

    varsym      → ( symbol {symbol | :} )⟨reservedop | dashes⟩
    consym      → ( : {symbol | :} )⟨reservedop⟩
    reservedop  → .. | : | :: | = | \ | | | <- | -> | @ | ~ | =>

Operator symbols are formed from one or more symbol characters, as defined above, and are lexically distinguished into two namespaces (Section 1.4):

An operator symbol starting with a colon is a constructor.

An operator symbol starting with any other character is an ordinary identifier.

Notice that a colon by itself, “:”, is reserved solely for use as the Haskell list constructor; this makes its treatment uniform with other parts of list syntax, such as “[]” and “[a,b]”.

Other than the special syntax for prefix negation, all operators are infix, although each infix operator can be used in a section to yield partially applied operators (see Section 3.5). All of the standard infix operators are just predefined symbols and may be rebound.
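The distinction between constructor operators and ordinary operators, and the use of sections, can be sketched as follows. The operators :+: and |+|, the type Expr and the fixities chosen for them are all assumptions made for this example.

```haskell
-- An operator symbol starting with a colon is a constructor.
infixr 5 :+:
data Expr = Lit Int | Expr :+: Expr

-- An operator symbol starting with any other character is an ordinary identifier.
infixl 6 |+|
(|+|) :: (Int, Int) -> (Int, Int) -> (Int, Int)
(a, b) |+| (c, d) = (a + c, b + d)

eval :: Expr -> Int
eval (Lit n)   = n
eval (l :+: r) = eval l + eval r

-- A section partially applies an infix operator (Section 3.5).
halves :: [Double]
halves = map (/ 2) [1, 2, 3]
```

Since :+: is a constructor it can appear in patterns, as in the second equation of eval, whereas |+| can only be applied.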

In the remainder of the report six different kinds of names will be used:

    varid                   (variables)
    conid                   (constructors)
    tyvar       → varid     (type variables)
    tycon       → conid     (type constructors)
    tycls       → conid     (type classes)
    modid       → conid     (modules)

Variables and type variables are represented by identifiers beginning with small letters, and the other four by identifiers beginning with capitals; also, variables and constructors have infix forms, the other four do not. Namespaces are also discussed in Section 1.4.

A name may optionally be qualified in certain circumstances by prepending it with a module identifier. This applies to variable, constructor, type constructor and type class names, but not type variables or module names. Qualified names are discussed in detail in Chapter 5.

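    qvarid      → [ modid . ] varid
    qconid      → [ modid . ] conid
    qtycon      → [ modid . ] tycon
    qtycls      → [ modid . ] tycls
    qvarsym     → [ modid . ] varsym
    qconsym     → [ modid . ] consym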

Since a qualified name is a lexeme, no spaces are allowed between the qualifier and the name.

Sample lexical analyses are shown below.

    This    Lexes as this
    f.g     f . g   (three tokens)
    F.g     F.g     (qualified ‘g’)
    f..     f ..    (two tokens)
    F..     F..     (qualified ‘.’)
    F.      F .     (two tokens)

The qualifier does not change the syntactic treatment of a name; for example, Prelude.+ is an infix operator with the same fixity as the definition of + in the Prelude (Section 4.4.2).
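Some of the lexical analyses above can be tried directly. This sketch imports the Prelude a second time under the alias P, which is an arbitrary choice for the example, as are the names compose, viaQualified and total.

```haskell
import Prelude
import qualified Prelude as P

-- 'negate.abs' lexes as three tokens (negate . abs), because 'negate'
-- begins with a lower-case letter and so cannot be a module qualifier.
compose :: Int -> Int
compose = negate.abs

-- 'P..' is a single lexeme: the qualified form of the operator '.'.
viaQualified :: Int -> Int
viaQualified = P.negate P.. P.abs

-- Qualified operators keep their fixity, so P.* binds tighter than P.+ here.
total :: Int
total = 1 P.+ 2 P.* 3
```

Writing a space between P. and the operator would change the lexing, since a qualified name is a single lexeme.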

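2.5 Numeric Literals

    decimal     → digit{digit}
    octal       → octit{octit}
    hexadecimal → hexit{hexit}

    integer     → decimal
                | 0o octal | 0O octal
                | 0x hexadecimal | 0X hexadecimal
    float       → decimal . decimal [exponent]
                | decimal exponent
    exponent    → (e | E) [+ | -] decimal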

There are two distinct kinds of numeric literals: integer and floating. Integer literals may be given in decimal (the default), octal (prefixed by 0o or 0O) or hexadecimal notation (prefixed by 0x or 0X). Floating literals are always decimal. A floating literal must contain digits both before and after the decimal point; this ensures that a decimal point cannot be mistaken for another use of the dot character. Negative numeric literals are discussed in Section 3.4. The typing of numeric literals is discussed in Section 6.4.1.
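A few literals in each of the notations just described are collected in the sketch below. The binding names are hypothetical; the literal forms are the ones defined by the grammar in this section.

```haskell
-- The same integer written in decimal, octal and hexadecimal notation.
decimalLit, octalLit, hexLit :: Integer
decimalLit = 255
octalLit   = 0o377
hexLit     = 0xFF

-- Floating literals are decimal and have digits on both sides of any point;
-- an exponent may follow, or may directly follow a decimal with no point.
floats :: [Double]
floats = [1.5, 2.0e3, 1e-2, 6.25E+1]
```

Forms such as .5 or 5. are not floating literals under this grammar, which is why the leading or trailing zero is always written.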

2.6 Character and String Literals

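    char        → ' (graphic⟨' | \⟩ | space | escape⟨\&⟩) '
    string      → " {graphic⟨" | \⟩ | space | escape | gap} "
    escape      → \ ( charesc | ascii | decimal | o octal | x hexadecimal )
    charesc     → a | b | f | n | r | t | v | \ | " | ' | &
    ascii       → ^cntrl | NUL | SOH | STX | ETX | EOT | ENQ | ACK
                | BEL | BS | HT | LF | VT | FF | CR | SO | SI | DLE
                | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN
                | EM | SUB | ESC | FS | GS | RS | US | SP | DEL
    cntrl       → ascLarge | @ | [ | \ | ] | ^ | _
    gap         → \ whitechar {whitechar} \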

Character literals are written between single quotes, as in’a’, and strings between double quotes, as in"Hello".

Escape codes may be used in characters and strings to represent special characters. Note that a single quote ' may be used in a string, but must be escaped in a character; similarly, a double quote " may be used in a character, but must be escaped in a string. \ must always be escaped. The category charesc also includes portable representations for the characters “alert” (\a), “backspace” (\b), “form feed” (\f), “new line” (\n), “carriage return” (\r), “horizontal tab” (\t), and “vertical tab” (\v).

Escape characters for the Unicode character set, including control characters such as \^X, are also provided. Numeric escapes such as \137 are used to designate the character with decimal representation 137; octal (e.g. \o137) and hexadecimal (e.g. \x37) representations are also allowed.

Consistent with the “maximal munch” rule, numeric escape characters in strings consist of all consecutive digits and may be of arbitrary length. Similarly, the one ambiguous ASCII escape code, "\SOH", is parsed as a string of length 1. The escape character \& is provided as a “null character” to allow strings such as "\137\&9" and "\SO\&H" to be constructed (both of length two). Thus "\&" is equivalent to "" and the character '\&' is disallowed. Further equivalences of characters are defined in Section 6.1.2.

A string may include a “gap”—two backslants enclosing white characters—which is ignored. This allows one to write long strings on more than one line by writing a backslant at the end of one line and at the start of the next. For example,

    "Here is a backslant \\ as well as \137, \
        \a numeric escape character, and \^X, a control character."

String literals are actually abbreviations for lists of characters (see Section 3.7).
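The escape, gap and list-of-characters rules of this section can be observed directly. The sketch below is illustrative only, and every name in it is hypothetical.

```haskell
-- Numeric escapes: decimal, octal and hexadecimal spellings of 'A'.
capitalA :: [Char]
capitalA = ['\65', '\o101', '\x41']

-- Maximal munch: "\SOH" is the single character SOH, while "\SO\&H"
-- is SO followed by 'H'. Likewise "\137\&9" is two characters, but
-- "\1379" is the single character with code 1379. "\&" is empty.
escapeLengths :: [Int]
escapeLengths = map length ["\SOH", "\SO\&H", "\137\&9", "\1379", "\&"]

-- A gap (backslash, white characters, backslash) is ignored entirely.
gapped :: String
gapped = "one \
         \two"

-- A string literal abbreviates a list of characters.
sameAsList :: Bool
sameAsList = "abc" == ['a', 'b', 'c']
```

Here `escapeLengths` comes out as one, two, two, one and zero characters respectively, and `gapped` is the seven-character string "one two".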

2.7 Layout

Haskell permits the omission of the braces and semicolons used in several grammar productions, by using layout to convey the same information. This allows both layout-sensitive and layout-insensitive styles of coding, which can be freely mixed within one program. Because layout is not required, Haskell programs can be straightforwardly produced by other programs.

The effect of layout on the meaning of a Haskell program can be completely specified by adding braces and semicolons in places determined by the layout. The meaning of this augmented program is now layout insensitive.

Informally stated, the braces and semicolons are inserted as follows. The layout (or “off-side”) rule takes effect whenever the open brace is omitted after the keyword where, let, do, or of. When this happens, the indentation of the next lexeme (whether or not on a new line) is remembered and the omitted open brace is inserted (the whitespace preceding the lexeme may include comments).

For each subsequent line, if it contains only whitespace or is indented more, then the previous item is continued (nothing is inserted); if it is indented the same amount, then a new item begins (a semicolon is inserted); and if it is indented less, then the layout list ends (a close brace is inserted).

If the indentation of the non-brace lexeme immediately following a where, let, do or of is less than or equal to the current indentation level, then instead of starting a layout, an empty list “{}” is inserted, and layout processing occurs for the current level (i.e. insert a semicolon or close brace).

A close brace is also inserted whenever the syntactic category containing the layout list ends; that is, if an illegal lexeme is encountered at a point where a close brace would be legal, a close brace is inserted. The layout rule matches only those open braces that it has inserted; an explicit open brace must be matched by an explicit close brace. Within these explicit open braces, no layout processing is performed for constructs outside the braces, even if a line is indented to the left of an earlier implicit open brace.

Section 9.3 gives a more precise definition of the layout rules.

Given these rules, a single newline may actually terminate several layout lists. Also, these rules permit:

    f x = let a = 1; b = 2
              g y = exp2
           in exp1

making a, b and g all part of the same layout list.
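To make the equivalence between the two styles concrete, here is a small sketch that writes one set of bindings three ways. The function names are hypothetical, and the bodies are stand-ins for the exp1 and exp2 placeholders used above.

```haskell
-- Layout style: indentation supplies the braces and semicolons.
viaLayout :: Int -> Int
viaLayout x =
  let a = 1
      b = 2
      g y = y + a + b
  in g x

-- Explicit style: the same bindings with braces and semicolons written
-- out. Inside explicit braces the layout rule is not applied.
viaBraces :: Int -> Int
viaBraces x = let { a = 1; b = 2
  ; g y = y + a + b } in g x

-- Mixed style, as in the example above: an explicit semicolon on the
-- first line, then a new item aligned with `a` on the next line.
mixed :: Int -> Int
mixed x = let a = 1; b = 2
              g y = y + a + b
          in g x
```

All three definitions denote the same function; only the way the binding list is delimited differs.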

As an example, Figure 2.1 shows a (somewhat contrived) module and Figure 2.2 shows the result of applying the layout rule to it. Note in particular: (a) the line beginning }};pop, where the termination of the previous line invokes three applications of the layout rule, corresponding to the depth (3) of the nested where clauses, (b) the close braces in the where clause nested within the tuple and case expression, inserted because the end of the tuple was detected, and (c) the close brace at the very end, inserted because of the column 0 indentation of the end-of-file token.

    module AStack( Stack, push, pop, top, size ) where
    data Stack a = Empty
                 | MkStack a (Stack a)

    push :: a -> Stack a -> Stack a
    push x s = MkStack x s

    size :: Stack a -> Int
    size s = length (stkToLst s)  where
               stkToLst  Empty         = []
               stkToLst (MkStack x s)  = x:xs where xs = stkToLst s

    pop :: Stack a -> (a, Stack a)
    pop (MkStack x s)
      = (x, case s of r -> i r where i x = x) -- (pop Empty) is an error

    top :: Stack a -> a
    top (MkStack x s) = x                     -- (top Empty) is an error

Figure 2.1: A sample program

    module AStack( Stack, push, pop, top, size ) where
    {data Stack a = Empty
                 | MkStack a (Stack a)

    ;push :: a -> Stack a -> Stack a
    ;push x s = MkStack x s

    ;size :: Stack a -> Int
    ;size s = length (stkToLst s)  where
               {stkToLst  Empty         = []
               ;stkToLst (MkStack x s)  = x:xs where {xs = stkToLst s

    }};pop :: Stack a -> (a, Stack a)
    ;pop (MkStack x s)
      = (x, case s of {r -> i r where {i x = x}}) -- (pop Empty) is an error

    ;top :: Stack a -> a
    ;top (MkStack x s) = x                     -- (top Empty) is an error
    }

Figure 2.2: Sample program with layout expanded


Chapter 3

Expressions

In this chapter, we describe the syntax and informal semantics of Haskell expressions, including their translations into the Haskell kernel, where appropriate. Except in the case of let expressions, these translations preserve both the static and dynamic semantics. Free variables and constructors used in these translations always refer to entities defined by the Prelude. For example, “concatMap” used in the translation of list comprehensions (Section 3.11) means the concatMap defined by the Prelude, regardless of whether or not the identifier “concatMap” is in scope where the list comprehension is used, and (if it is in scope) what it is bound to.
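The point about Prelude entities can be illustrated with a hedged sketch. The hand-written version below only mirrors the general shape of a concatMap-based translation; it is not the exact rule of Section 3.11, and a compiler is free to implement comprehensions differently. All names are hypothetical.

```haskell
-- A list comprehension with two generators.
pairs :: [(Int, Char)]
pairs = [ (x, c) | x <- [1, 2], c <- "ab" ]

-- Hand-written equivalent in the style of the kernel translation: each
-- generator becomes a concatMap over its source list.
pairsViaConcatMap :: [(Int, Char)]
pairsViaConcatMap =
  concatMap (\x -> concatMap (\c -> [(x, c)]) "ab") [1, 2]

-- A local binding named concatMap does not affect the comprehension:
-- if it were used, the result would be the empty list.
shadowed :: [Int]
shadowed = let concatMap _ _ = [] in [ y * 2 | y <- [1, 2, 3] ]
```

The first two bindings produce the same list, and `shadowed` still doubles every element even though a different `concatMap` is in scope at that point.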

In the syntax that follows, there are some families of nonterminals indexed by precedence levels (written as a superscript). Similarly, the nonterminals op, varop, and conop may have a double index: a letter l, r, or n for left-, right- or non-associativity and a precedence level. A precedence-level variable i ranges from 0 to 9; an associativity variable a varies over {l, r, n}. For example

    aexp → ( exp^(i+1) qop^(a,i) )

actually stands for 30 productions, with 10 substitutions for i and 3 for a.
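The count of 30 is simply the number of (associativity, precedence) pairs. A minimal sketch, using a hypothetical `Assoc` type that is not part of the report, enumerates them:

```haskell
-- Associativity index: left, right or none.
data Assoc = L | R | N deriving (Show, Eq)

-- Every (a, i) instance of a production indexed by an associativity a
-- and a precedence level i.
instances :: [(Assoc, Int)]
instances = [ (a, i) | a <- [L, R, N], i <- [0 .. 9] ]

-- Three associativities times ten precedence levels.
instanceCount :: Int
instanceCount = length instances
```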

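    exp     → exp^0 :: [context =>] type             (expression type signature)
            | exp^0
    exp^i   → exp^(i+1) [qop^(n,i) exp^(i+1)]
            | lexp^i
            | rexp^i
    lexp^i  → (lexp^i | exp^(i+1)) qop^(l,i) exp^(i+1)
    lexp^6  → - exp^7
    rexp^i  → exp^(i+1) qop^(r,i) (rexp^i | exp^(i+1))
    exp^10  → \ apat_1 ... apat_n -> exp              (lambda abstraction, n ≥ 1)
            | let decls in exp                       (let expression)
            | if exp then exp else exp               (conditional)
            | case exp of { alts }                   (case expression)
            | do { stmts }                           (do expression)
            | fexp
    fexp    → [fexp] aexp                            (function application)

    aexp    → qvar                                   (variable)
            | gcon                                   (general constructor)
            | literal
            | ( exp )                                (parenthesized expression)
            | ( exp_1 , ... , exp_k )                (tuple, k ≥ 2)
            | [ exp_1 , ... , exp_k ]                (list, k ≥ 1)
            | [ exp_1 [, exp_2] .. [exp_3] ]         (arithmetic sequence)
            | [ exp | qual_1 , ... , qual_n ]        (list comprehension, n ≥ 1)
            | ( exp^(i+1) qop^(a,i) )                (left section)
            | ( lexp^i qop^(l,i) )                   (left section)
            | ( qop^(a,i)⟨-⟩ exp^(i+1) )             (right section)
            | ( qop^(r,i)⟨-⟩ rexp^i )                (right section)
            | qcon { fbind_1 , ... , fbind_n }       (labeled construction, n ≥ 0)
            | aexp⟨qcon⟩ { fbind_1 , ... , fbind_n } (labeled update, n ≥ 1)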

Expressions involving infix operators are disambiguated by the operator’s fixity (see Section 4.4.2).

Consecutive unparenthesized operators with the same precedence must both be either left or right associative to avoid a syntax error. Given an unparenthesized expression “x qop^(a,i) y qop^(b,j) z”, parentheses must be added around either “x qop^(a,i) y” or “y qop^(b,j) z” when i = j unless a = b = l or a = b = r.
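The following sketch shows the rule in action with two hypothetical operators, `subL` and `subR`, which both perform subtraction at precedence 6 but are declared with opposite associativity.

```haskell
infixl 6 `subL`
infixr 6 `subR`

subL, subR :: Int -> Int -> Int
subL = (-)
subR = (-)

-- Both operators left associative: parses as (10 - 4) - 3.
leftChain :: Int
leftChain = 10 `subL` 4 `subL` 3

-- Both operators right associative: parses as 10 - (4 - 3).
rightChain :: Int
rightChain = 10 `subR` 4 `subR` 3

-- Writing 10 `subL` 4 `subR` 3 without parentheses is rejected, since
-- the precedences are equal but the associativities differ.
-- Parentheses must pick one grouping or the other.
groupedLeft, groupedRight :: Int
groupedLeft  = (10 `subL` 4) `subR` 3
groupedRight = 10 `subL` (4 `subR` 3)
```

The two groupings give different results here (3 and 9), which is why the unparenthesized mixture has no single sensible reading.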

Negation is the only prefix operator in Haskell; it has the same precedence as the infix - operator defined in the Prelude (see Section 4.4.2, Figure 4.1).

The grammar is ambiguous regarding the extent of lambda abstractions, let expressions, and conditionals. The ambiguity is resolved by the meta-rule that each of these constructs extends as far to the right as possible.

Sample parses are shown below.

    This                          Parses as
    f x + g y                     (f x) + (g y)
    - f x + y                     (- (f x)) + y
    let { ... } in x + y          let { ... } in (x + y)
    z + let { ... } in x + y      z + (let { ... } in (x + y))
    f x y :: Int                  (f x y) :: Int
    \ x -> a+b :: Int             \ x -> ((a+b) :: Int)
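Several of these parses can be confirmed by choosing values for which the alternative grouping would give a different answer. The functions `f` and `g` and the other names below are hypothetical.

```haskell
f, g :: Int -> Int
f = (* 2)
g = (+ 1)

-- Application binds tighter than any operator: (f 3) + (g 4).
applicationFirst :: Int
applicationFirst = f 3 + g 4

-- Prefix negation has the precedence of infix minus: (- (f 3)) + 1.
negationThenPlus :: Int
negationThenPlus = - f 3 + 1

-- let extends as far right as possible: 2 * (let ... in (a + 3)).
letExtent :: Int
letExtent = 2 * let { a = 1 } in a + 3

-- The lambda body includes the signature: \x -> ((x + 1) :: Int).
lambdaExtent :: Int
lambdaExtent = (\x -> x + 1 :: Int) 2
```

For instance, `negationThenPlus` evaluates to -5; had the negation applied to the whole sum it would be -7. Likewise `letExtent` is 8 rather than the 5 that `(2 * let { a = 1 } in a) + 3` would give.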

A note about parsing. Expressions that involve the interaction of fixities with the let/lambda meta-rule may be hard to parse. For example, the expression
