1 Think Euphoria

1.1 The Way of the Program

1.2 First Draft

This is only the first draft of Think Euphoria.

It is an adaptation of a work by Allen Downey, Think Python: An Introduction to Software Design. See http://www.thinkpython.com

GNU Free Documentation License

Copyright (c) 2009 Tom Ciplijauskas


The goal of this book is to teach you to think like a computer scientist. This way of thinking combines some of the best features of mathematics, engineering, and natural science. Like mathematicians, computer scientists use formal languages to denote ideas (specifically computations). Like engineers, they design things, assembling components into systems and evaluating tradeoffs among alternatives. Like scientists, they observe the behavior of complex systems, form hypotheses, and test predictions.

The single most important skill for a computer scientist is problem solving. Problem solving means the ability to formulate problems, think creatively about solutions, and express a solution clearly and accurately. As it turns out, the process of learning to program is an excellent opportunity to practice problem-solving skills. That's why this chapter is called "The way of the program."

On one level, you will be learning to program, a useful skill by itself. On another level, you will use Euphoria programming as a means to an end. As we go along, that end will become clearer.

1.3 The Euphoria programming language



The programming language you will be learning is Euphoria. Euphoria is an example of a high-level language. As you might infer from the name "high-level language", there are also "low-level" languages. Low-level computer languages are constrained by the hardware they are designed for. At their highest level, computer languages work the way humans view problems.

Level   Example     Characteristics
low     machine     binary code: specific to each CPU
        assembly    aid to writing machine code
        C, C++      fancy way to write assembly code
high    Euphoria    powerful, simple, fun


Machine instructions look somewhat like this:

 
	00101110 
	10110001 
	00010011 
	00110011 
	10001100 
	... 
	01111101 

Clearly machine-language is difficult to view as a computer language at all; and yet this is the only way to program and operate a computer processor. In order to cope with this nightmare of digits, increasingly higher-level languages were invented. Assembly language looks somewhat like this:

 
	mov eax, 0 
	label: 
	inc eax 
	cmp eax, 50 
	jne label 

Not much better. At least a program called an "assembler" is available to translate from assembly language to machine language. Assembly language is only worth the effort when a critical part of a program must be optimized for speed or memory usage. Operating systems and hardware interfaces are typical applications.

Loosely speaking, computers can only execute programs written in machine language. Thus, programs written in a higher-level language have to be processed before they can run. This extra processing takes some time, which is a small disadvantage of high-level languages.

But the advantages are enormous. First, it is much easier to program in a high-level language. Programs written in a high-level language take less time to write, they are shorter and easier to read, and they are more likely to be correct. Second, high-level languages are portable. Portable programs can run on Windows/DOS, Linux, BSD, and OSX systems with little or no modification. Machine-level programs can run on only one kind of computer and have to be rewritten to run on another.

Due to these advantages, almost all programs are written in high-level languages. Machine languages are used only for a few specialized applications.

Two kinds of programs process high-level languages into machine language: interpreters and compilers.

An interpreter reads a high-level program and executes it immediately, meaning that it does what the program says. A classical interpreter processes the program (sometimes called a script) a little at a time, alternately reading lines and performing computations. This process tends to be inefficient in terms of speed and computer resources, but it gives the user of the program an immediate response, so interpreters are easy to use. The disadvantage of an interpreter is that it must run every time the program is run.

png/01_Xinterpret.png

A compiler reads the program and converts it completely into a machine-executable binary file. Compilers are able to optimize the code they produce so that it runs faster or uses less memory. In this case, the high-level program is called the source-code, and the translated program is called the object-code or the executable. The user must then, as a separate step, run this file in order to have the program execute. Compiling a program can be a real chore and may take some time before an executable is produced. The advantage is that once a program is compiled, you can execute it repeatedly without the need for the compiler.

png/01_compile.png

Euphoria is considered an interpreted language because Euphoria programs are executed directly from source-code. Euphoria is actually an interesting twist on the classical interpreter. When you run a Euphoria program it appears to run immediately. Euphoria is in fact a two-step interpreter. First the high-level code is parsed into IL (Intermediate Language). Then this IL is interpreted and run. This clever approach lets you run high-level code directly, at nearly the speed of a compiled program.

png/01_internals.png

You usually run the Euphoria interpreter without concern for how the internals are arranged. However, it is possible to execute the front-end and back-end separately. Executing the front-end is called shrouding because the IL-code file that is produced is no longer in a human readable form. The IL-code file can then be interpreted by the back-end. This is one way to distribute your programs without releasing your source-code.

png/01_shroud.png

There is nothing about a high-level language that restricts it from being either interpreted or compiled. In fact, for some languages both are available. In the case of Euphoria, your high-level code may be translated into the C language. Then a standard C compiler can be used to produce an executable program. You can do your development work using the Euphoria interpreter and produce a final product using a compiler. (It's like writing a program in C without having to do any actual C coding.)

png/01_Xcompile.png

If your goal is to simply produce a stand-alone program, there is one more option. You can bind a Euphoria program with the interpreter into a single file. A user of a bound program can execute the program without the need for the Euphoria interpreter.

png/01_bind.png

The programming environment represents the fine details of how you actually create and run programs. Creating Euphoria programs is straightforward. First you write your program using any text editor and save it as a file. You then run the interpreter, which reads this file and immediately executes your program.

You may use any ordinary text editor to create programs for Euphoria.

Euphoria comes with a text editor that features syntax highlighting which adds color to important features of the source code. Examples in this book are syntax colored in the same way.

Euphoria works on many platforms: WIN32, DOS32, LINUX, BSD, OSX. Programs can usually be written to work on multiple platforms with little or no modification. You may expect text programs to work on all platforms. Graphical programs, when they use a cross-platform library (such as Japi, IUP, or wxWidgets), will work in both Windows and Linux.

If you use a programming editor, such as the one that comes with Euphoria, you can save and run your program in one step. In other development environments--depending on choice of operating system and editor--the details of executing programs may differ. Also, most programs are more interesting than this one.

1.4 What is a program?



A program is a list of instructions that specify how to perform a computation. The computation might be something mathematical, such as solving a system of equations or finding the roots of a polynomial, but it can also be a symbolic computation, such as searching and replacing text in a document or (strangely enough) interpreting a program.

The details look different in different languages, but a few basic instructions appear in just about every language:

Believe it or not, that's pretty much all there is to it. Every program you've ever used, no matter how complicated, is made up of instructions that look more or less like these. Thus, we can describe programming as the process of breaking a large, complex task into smaller and smaller subtasks until the subtasks are simple enough to be performed with one of these basic instructions.
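As a rough illustration, the short sketch below uses the kinds of basic instructions usually meant here: arithmetic, repetition, conditional execution, and output. The variable names count and total are made up for this example, and the statements it uses (for, if, puts, ?) are only previewed here, not explained.

```euphoria
-- A sketch only: count and total are illustrative names.
integer count
atom total

count = 5
total = 0

-- repetition: perform the same action several times
for i = 1 to count do
    -- math: a basic arithmetic operation
    total = total + i * i
end for

-- conditional execution: check a condition and act on the result
if total > 50 then
    puts(1, "The sum of squares is large: ")
else
    puts(1, "The sum of squares is small: ")
end if

-- output: display data on the screen
? total
```

Here the loop adds 1, 4, 9, 16, and 25, so the program takes the first branch and displays 55.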

That may be a little vague, but we will come back to this topic later when we talk about algorithms. An algorithm is an exact set of instructions needed to complete a computing task.

1.5 What is debugging?



Programming is a complex process, and because it is done by human beings, it often leads to errors. For whimsical reasons, a programming error is called a bug, and the process of tracking bugs down and correcting them is called debugging. Four kinds of errors can occur in a program: syntax errors, runtime errors, hardware errors, and semantic errors. It is useful to distinguish between them in order to track them down more quickly.

1.5.1 Syntax errors



Euphoria can only execute a program if the program is syntactically correct; otherwise, the process fails and returns a syntax error message. Syntax refers to the structure of a program and the rules about that structure. For example, in English, a sentence must begin with a capital letter and end with a period. In Euphoria, parentheses must come in matching pairs, so (1+2) is legal, but 8) is a syntax error.

For most readers, a few syntax errors are not a significant problem, which is why we can read the poetry of e. e. cummings without spewing error messages. Computer languages are not so forgiving. If there is a single syntax error anywhere in your program, Euphoria will print an error message and quit, and you will not be able to run your program. During the first few weeks of your programming career, you will probably spend a lot of time tracking down syntax errors. As you gain experience, though, you will make fewer errors and find them faster.
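The parentheses example can be tried directly. This is a minimal sketch; the variable name result is an assumption made for the illustration, and the illegal line is left as a comment so that the program still runs.

```euphoria
integer result

-- legal: the parentheses come in a matching pair
result = (1 + 2)
? result

-- illegal: remove the comment marker from the next line and the
-- program will be rejected with a syntax error before it runs
-- ? 8)
```

Restoring the commented line is a quick way to see what a syntax error message looks like.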

1.5.2 Runtime errors



The second type of error is a runtime error, so called because the error does not appear until you run the program.

Runtime errors are rare in the simple programs you will see in the first few chapters, so it might be a while before you encounter one.
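One common kind of runtime error is asking for an element that does not exist. The sketch below is syntactically correct either way; the names s and position are illustrative, and the check is included only so that the example runs to completion.

```euphoria
sequence s
integer position

s = {10, 20, 30}
position = 4

-- s has only 3 elements, so s[4] does not exist
if position <= length(s) then
    ? s[position]
else
    puts(1, "position is out of range\n")
end if

-- Without the check, the single statement
--     ? s[position]
-- passes the syntax check but stops the program with a runtime error.
```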

1.5.3 Hardware errors



It comes as a shock to beginning programmers that computers do not always produce the correct arithmetic result. These errors occur not because of any computer language, but because computers themselves have practical limits to the accuracy of calculations. A hardware error is due to the physical limitations of computer hardware. You will not get a warning or error message when such errors occur.

On a typical computer you may calculate using about 15 decimal digits of accuracy, because Euphoria stores non-integer values as 64-bit floating-point numbers. This sounds like a lot, but it is not enough for some problems.

 
	atom x1, x2, y1, y2, z 
 
	x1 = 10.000000000000004 
	x2 = 10.000000000000000 
	y1 = 10.00000000000004 
	y2 = 10.00000000000000 
 
	z = (y1 - y2 ) / ( x1 - x2 ) 
	? z 
 
		-- the computer responds with 
		-- 11.5 

The exact answer is 10. This problem is the "same" as 40/4 = 10, but this example needs more than the 15 digits of accuracy available to the computer.

Other hardware errors occur because there is a difference between binary arithmetic (what the computer uses) and decimal arithmetic (what you wrote in your program). You will not get any warning when this type of error occurs.
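A small sketch of the binary versus decimal difference, assuming the usual 64-bit floating-point arithmetic: neither 0.1 nor 0.2 can be stored exactly in binary, so their sum is not quite the stored value of 0.3. The variable name a is illustrative.

```euphoria
atom a

a = 0.1 + 0.2

-- on standard 64-bit floating point this comparison fails
if a = 0.3 then
    puts(1, "0.1 + 0.2 equals 0.3\n")
else
    puts(1, "0.1 + 0.2 does not equal 0.3\n")
end if
```

No error message appears; the program simply takes the branch you might not have expected.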

In other cases, seemingly tiny computational errors may accumulate after many calculations are performed. The end result can be a significant error in your computation. However, extremely accurate calculations are possible--such as calculating pi to a million digits--if you use the special techniques of numerical methods developed for these kinds of calculations.
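Accumulation can be seen even in a very small loop. The following is a sketch under the assumption of ordinary 64-bit floating-point arithmetic; the name total is illustrative, and the exact digits displayed may vary between systems.

```euphoria
atom total

total = 0

-- add one tenth, ten times
for i = 1 to 10 do
    total = total + 0.1
end for

-- each addition may round slightly, so total is typically
-- very close to 1 without being exactly 1
if total = 1 then
    puts(1, "total is exactly 1\n")
else
    puts(1, "total is close to 1, but not exactly 1\n")
end if
```

With only ten additions the discrepancy is tiny; the concern in real computations is that millions of such steps can add up.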

1.5.4 Semantic errors



The fourth type of error is the semantic error. If there is a semantic error in your program, it will run successfully, in the sense that the computer will not generate any error messages, but it will not do the right thing. It will do something else. Specifically, it will do what you told it to do.

The problem is that the program you wrote is not the program you wanted to write. The meaning of the program (its semantics) is wrong. Identifying semantic errors can be tricky because it requires you to work backward by looking at the output of the program and trying to figure out what it is doing.
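Here is a small sketch of a semantic error. The intent is to compute the average of two numbers; the names a, b, and average are invented for the example.

```euphoria
atom a, b, average

a = 2
b = 4

-- semantic error: division is done before addition,
-- so this computes a + (b / 2), not the average
average = a + b / 2
? average    -- displays 4

-- corrected: the parentheses force the addition to happen first
average = (a + b) / 2
? average    -- displays 3
```

Both versions run without any error message. Only by comparing the output with what you expected can you tell that the first one is wrong.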

1.5.5 Experimental Debugging



One of the most important skills you will acquire is debugging. Although it can be frustrating, debugging is one of the most intellectually rich, challenging, and interesting parts of programming.

In some ways, debugging is like detective work. You are confronted with clues, and you have to infer the processes and events that led to the results you see.

Debugging is also like an experimental science. Once you have an idea what is going wrong, you modify your program and try again. If your hypothesis was correct, then you can predict the result of the modification, and you take a step closer to a working program. If your hypothesis was wrong, you have to come up with a new one. As Sherlock Holmes pointed out, "When you have eliminated the impossible, whatever remains, however improbable, must be the truth." (A. Conan Doyle, The Sign of Four)

For some people, programming and debugging are the same thing. That is, programming is the process of gradually debugging a program until it does what you want. The idea is that you should start with a program that does something and make small modifications, debugging them as you go, so that you always have a working program.

For example, Linux is an operating system that contains millions of lines of code, but it started out as a simple program Linus Torvalds used to explore the Intel 80386 chip. According to Larry Greenfield, "One of Linus's earlier projects was a program that would switch between printing AAAA and BBBB. This later evolved to Linux." (The Linux Users' Guide Beta Version 1)

In later chapters I will make more suggestions about debugging and other programming practices.

1.5.6 Formal and natural languages



Natural languages are the languages that people speak, such as English, Spanish, and French. They were not designed by people (although people try to impose some order on them); they evolved naturally.

Formal languages are languages that are designed by people for specific applications. For example, the notation that mathematicians use is a formal language that is particularly good at denoting relationships among numbers and symbols. Chemists use a formal language to represent the chemical structure of molecules. And most importantly:

Programming languages are formal languages that have been designed to express computations.

Formal languages tend to have strict rules about syntax. For example, 3+3=6 is a syntactically correct mathematical statement, but 3+=3$6 is not. H2O is a syntactically correct chemical formula, but 2Zz is not.

Syntax rules come in two flavors, pertaining to tokens and structure. Tokens are the basic elements of the language, such as words, numbers, and chemical elements. One of the problems with 3+=3$6 is that $ is not a legal token in mathematics (at least as far as we know). Similarly, 2Zz is not legal because there is no element with the abbreviation Zz.

The second type of syntax error pertains to the structure of a command--that is, the way the tokens are arranged. The statement 3=+6$ is structurally illegal because you can't place a plus sign immediately after an equal sign. Similarly, molecular formulas have to have subscripts after the element name, not before.

When you read a sentence in English or a command in a formal language, you have to figure out what the structure of the sentence is (although in a natural language you do this subconsciously). This process is called parsing. For example, when you hear the sentence, "The penny dropped," you understand that "the penny" is the subject and "dropped" is the predicate. Once you have parsed a sentence, you can figure out what it means, or the semantics of the sentence. Assuming that you know what a penny is and what it means to drop, you will understand the general implication of this sentence.

Although formal and natural languages have many features in common--tokens, structure, syntax, and semantics--there are many differences:

People who grow up speaking a natural language--everyone--often have a hard time adjusting to formal languages. In some ways, the difference between formal and natural language is like the difference between poetry and prose, but more so:

Here are some suggestions for reading programs (and other formal languages). First, remember that formal languages are much more dense than natural languages, so it takes longer to read them. Also, the structure is very important, so it is usually not a good idea to read from top to bottom, left to right. Instead, learn to parse the program in your head, identifying the tokens and interpreting the structure. One advantage of Euphoria is that variables and routines must be defined before use, which makes code easier to read. Finally, the details matter. Little things like spelling errors and bad punctuation, which you can get away with in natural languages, can make a big difference in a formal language.

1.5.7 The first program



Traditionally, the first program written in a new language is called "Hello, World!" because all it does is display the words, "Hello, World!" In Euphoria, it looks like this:

 
	puts(1, "Hello, World!")

You see the result:

 
	Hello, World! 

This is an example of an output command, in this case puts(), which is used to display information in a text format.

The quotation marks in the program mark the beginning and end of the text value you want to display; they don't appear in the result.
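The first argument to puts() selects where the text goes: 1 is standard output (normally the screen) and 2 is standard error. The sketch below also ends each line with \n, the newline character, so that later output starts on a fresh line. The variable name greeting is an assumption made for the illustration.

```euphoria
sequence greeting

-- a text value can be stored in a variable before it is displayed
greeting = "Hello, World!"

-- 1 means standard output; & joins the text and the newline
puts(1, greeting & "\n")

-- 2 means standard error, often used for messages about problems
puts(2, "This line goes to the error stream\n")
```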

Some people judge the quality of a programming language by the simplicity of the "Hello, World!" program. By this standard, Euphoria does well.

1.6 Debugging



It is a good idea to read this book in front of a computer so you can try out the examples as you go.

Whenever you are experimenting with a new feature, you should try to make mistakes. For example, in the "Hello, World!" program, what happens if you leave out one of the quotation marks? What if you leave out both? What if you spell puts wrong?

This kind of experiment helps you remember what you read; it also helps with debugging, because you get to know what the error messages mean. It is better to make mistakes now and on purpose than later and accidentally.

Programming, and especially debugging, sometimes brings out strong emotions. If you are struggling with a difficult bug, you might feel angry, despondent or embarrassed.

There is evidence that people naturally respond to computers as if they were people. When they work well, we think of them as teammates, and when they are obstinate or rude, we respond to them the same way we respond to rude, obstinate people.

Preparing for these reactions might help you deal with them. One approach is to think of the computer as an employee with certain strengths, like speed and precision, and particular weaknesses, like lack of empathy and inability to grasp the big picture.

Your job is to be a good manager: find ways to take advantage of the strengths and mitigate the weaknesses. And find ways to use your emotions to engage with the problem, without letting your reactions interfere with your ability to work effectively.

Learning to debug can be frustrating, but it is a valuable skill that is useful for many activities beyond programming. At the end of each chapter there is a debugging section, like this one, with my thoughts about debugging. I hope they help!

Euphoria is designed to make debugging less painful. When a programming error occurs, Euphoria produces a meaningful, human-readable error message. Low-level languages usually just tell you that something is wrong.

To determine exactly what a program is doing you may trace each step, one at a time, observing how values change. Euphoria lets you trace the entire program using a full screen display.




1.7 Glossary