Euphoria Database System

Euphoria Database System (EDS)

Introduction

Many people have expressed an interest in accessing databases using Euphoria programs. Those people have either wanted to access a name-brand database management system from Euphoria, or they've wanted a simple, easy-to-use, Euphoria-oriented database for storing data. EDS is the latter. It provides a simple, extremely flexible, database system for use by Euphoria programs.

Structure of an EDS database

In EDS, a database is a single file with .edb file type. An EDS database contains 0 or more tables. Each table has a name, and contains 0 or more records. Each record consists of a key part, and a data part. The key can be any Euphoria object - an atom, a sequence, a deeply-nested sequence, whatever. Similarly the data can be any Euphoria object. There are no constraints on the size or structure of the key or data. Within a given table, the keys are all unique. That is, no two records in the same table can have the same key part.

The records of a table are stored in ascending order of key value. An efficient binary search is used when you refer to a record by key. You can also access a record directly, with no search, if you know its current record number within the table. Record numbers are integers from 1 to the length (current number of records) of the table. By incrementing the record number, you can efficiently step through all the records, in order of key. Note however that a record's number can change whenever a new record is inserted, or an existing record is deleted.

The keys and data parts are stored in a compact form, but no accuracy is lost when saving or restoring floating-point numbers or any other Euphoria data.

database.e will work as is, on Windows, DOS, Linux or FreeBSD. EDS database files can be copied and shared between programs running on Linux/FreeBSD and DOS/Windows. Be sure to make an exact byte for byte copy using "binary" mode copying, rather than "text" or "ASCII" mode which could change the line terminators.

 Example:
    database: "mydata.edb"
          first table: "passwords"
               record #1:  key: "jones"   data: "euphor123"
               record #2:  key: "smith"   data: "billgates"
        
          second table: "parts"
               record #1:  key: 134525    data: {"hammer", 15.95, 500}
               record #2:  key: 134526    data: {"saw", 25.95, 100}
               record #3:  key: 134530    data: {"screw driver", 5.50, 1500}

It's up to you to interpret the meaning of the key and data. In keeping with the spirit of Euphoria, you have total flexibility. Unlike most other database systems, an EDS record is not required to have either a fixed number of fields, or fields with a preset maximum length.

In many cases there will not be any natural key value for your records. In those cases you should simply create a meaningless, but unique, integer to be the key. Remember that you can always access the data by record number. It's easy to loop through the records looking for a particular field value.

How to access the data

To reduce the number of parameters that you have to pass, there is a notion of the current database, and current table. Most routines use these current values automatically. You normally start by opening (or creating) a database file, then selecting the table that you want to work with.

You can map a key to a record number using db_find_key(). It uses an efficient binary search. Most of the other record-level routines expect the record number as an argument. You can very quickly access any record, given it's number. You can access all the records by starting at record number 1 and looping through to the record number returned by db_table_size().

How does storage get recycled?

When you delete something, such as a record, the space for that item gets put on a free list, for future use. Adjacent free areas are combined into larger free areas. When more space is needed, and no suitable space is found on the free list, the file will grow in size. Currently there is no automatic way that a file will shrink in size, but you can use db_compress() to completely rewrite a database, removing the unused spaces.

Security / Multi-user Access

This release provides a simple way to lock an entire database to prevent unsafe access by other processes.

Scalability

Internal pointers are 4 bytes. In theory that limits the size of a database file to 4 Gb. In practice, the limit is 2 Gb because of limitations in various C file functions used by Euphoria. Given enough user demand, EDS databases could be expanded well beyond 2 Gb in the future.

The current algorithm allocates 4 bytes of memory per record in the current table. So you'll need at least 4Mb RAM per million records on disk.

The binary search for keys should work reasonably well for large tables.

Inserts and deletes take slightly longer as a table gets larger.

At the low end of the scale, it's possible to create extremely small databases without incurring much disk space overhead.

Disclaimer

Do not store valuable data without a backup. RDS will not be responsible for any damage or data loss.

Database Routines

In the descriptions below, to indicate what kind of object may be passed in and returned, the following prefixes are used:

x	-	a general object (atom or sequence)
s	-	a sequence
a	-	an atom
i	-	an integer
fn	-	an integer used as a file number
st	-	a string sequence, or single-character atom

db_create - create a new database

db_open - open an existing database

db_select - select a database to be the current one

db_close - close a database

db_create_table - create a new table within a database

db_select_table - select a table to be the current one

db_rename_table - rename a table

db_delete_table - delete a table

db_table_list - get a list of all the table names in a database

db_table_size - get the number of records in the current table

db_find_key - quickly find the record with a certain key value

db_record_key - get the key portion of a record

db_record_data - get the data portion of a record

db_insert - insert a new record into the current table

db_delete_record - delete a record from the current table

db_replace_data - replace the data portion of a record

db_compress - compress a database

db_dump - print the contents of a database

db_fatal_id - handle fatal database errors

db_create

Syntax: include database.e
i1 = db_create(s, i2)

Description: Create a new database. A new database will be created in the file with path given by s. i2 indicates the type of lock that will be applied to the file as it is created. i1 is an error code that indicates success or failure. The values for i2 can be either DB_LOCK_NO (no lock) or DB_LOCK_EXCLUSIVE (exclusive lock). i1 is DB_OK if the new database is successfully created. This database becomes the current database to which all other database operations will apply.

Comments: If the path, s, does not end in .edb, it will be added automatically.
If the database already exists, it will not be overwritten. db_create() will return DB_EXISTS_ALREADY.
A version number is stored in the database file so future versions of the database software can recognize the format, and possibly read it and deal with it in some way.

Example:

if db_create("mydata", DB_LOCK_NO) != DB_OK then puts(2, "Couldn't create the database!\n") abort(1) end if

See Also: db_open, db_close

db_open

Syntax: include database.e
i1 = db_open(s, i2)

Description: Open an existing Euphoria database. The file containing the database is given by s. i1 is a return code indicating success or failure. i2 indicates the type of lock that you want to place on the database file while you have it open. This database becomes the current database to which all other database operations will apply.
The return codes are:

global constant DB_OK = 0 -- success DB_OPEN_FAIL = -1 -- couldn't open the file DB_LOCK_FAIL = -3 -- couldn't lock the file in the -- manner requested

Comments: The types of lock that you can use are: DB_LOCK_NO (no lock), DB_LOCK_SHARED (shared lock for read-only access) and DB_LOCK_EXCLUSIVE (for read/write access). DB_LOCK_SHARED is only supported on Linux/FreeBSD. It allows you to read the database, but not write anything to it. If you request DB_LOCK_SHARED on WIN32 or DOS32 it will be treated as if you had asked for DB_LOCK_EXCLUSIVE.
If the lock fails, your program should wait a few seconds and try again. Another process might be currently accessing the database.
DOS programs will typically get a "critical error" message if they try to access a database that is currently locked.

Example:

tries = 0 while 1 do err = db_open("mydata", DB_LOCK_SHARED) if err = DB_OK then exit elsif err = DB_LOCK_FAIL then tries += 1 if tries > 10 then puts(2, "too many tries, giving up\n") abort(1) else sleep(5) end if else puts(2, "Couldn't open the database!\n") abort(1) end if end while

See Also: db_create, db_close

db_select

Syntax: include database.e
i = db_select(s)

Description: Choose a new, already open, database to be the current database. Subsequent database operations will apply to this database. s is the path of the database file as it was originally opened with db_open() or db_create(). i is a return code indicating success (DB_OK) or failure.

Comments: When you create (db_create) or open (db_open) a database, it automatically becomes the current database. Use db_select() when you want to switch back and forth between open databases, perhaps to copy records from one to the other.
After selecting a new database, you should select a table within that database using db_select_table().

Example:

if db_select("employees") != DB_OK then puts(2, "Couldn't select employees database\n") end if

See Also: db_open

db_close

Syntax: include database.e
db_close()

Description: Unlock and close the current database.

Comments: Call this procedure when you are finished with the current database. Any lock will be removed, allowing other processes to access the database file.

See Also: db_open

db_create_table

Syntax: include database.e
i = db_create_table(s)

Description: Create a new table within the current database. The name of the table is given by the sequence of characters, s, and may not be the same as any existing table in the current database.

Comments: The table that you create will initially have 0 records. It becomes the current table.

Example:

if db_create_table("my_new_table") != DB_OK then puts(2, "Couldn't create my_new_table!\n") end if

See Also: db_delete_table

db_select_table

Syntax: include database.e
i = db_select_table(s)

Description: The table with name given by s, becomes the current table. The return code, i, will be DB_OK if the table exists in the current database, otherwise you'll get DB_OPEN_FAIL.

Comments: All record-level database operations apply automatically to the current table.

Example:

if db_select_table("salary") != DB_OK then puts(2, "Couldn't find salary table!\n") abort(1) end if

See Also: db_create_table, db_delete_table

db_rename_table

Syntax: include database.e
db_rename_table(s1, s2)

Description: Rename a table in the current database. The current name of the table is given by s1. The new name of the table is s2.

Comments: The table to be renamed can be the current table, or some other table in the current database. An error will occur if s1 is not the name of a table in the current database, or if s2 is the name of an existing table in the current database.

See Also: db_create_table db_select_table db_delete_table

db_delete_table

Syntax: include database.e
db_delete_table(s)

Description: Delete a table in the current database. The name of the table is given by s.

Comments: All records are deleted and all space used by the table is freed up. If the table is the current table, the current table becomes undefined.
If there is no table with the name given by s, then nothing happens.

See Also: db_create_table db_select_table

db_table_list

Syntax: s = db_table_list()

Description: Return a sequence of all the table names in the current database. Each element of s is a sequence of characters containing the name of a table.

Example:

sequence names names = db_table_list() for i = 1 to length(names) do puts(1, names[i] & '\n') end for

See Also: db_create_table

db_table_size

Syntax: include database.e
i = db_table_size()

Description: Return the current number of records in the current table.

Example:

-- look at all records in the current table for i = 1 to db_table_size() do if db_record_key(i) = 0 then puts(1, "0 key found\n") exit end if end for

See Also: db_select_table

db_find_key

Syntax: include database.e
i = db_find_key(x)

Description: Find the record in the current table with key value x. If found, the record number will be returned. If not found, the record number that key would occupy, if inserted, is returned as a negative number.

Comments: A fast binary search is used to find the key in the current table. The number of comparisons is proportional to the log of the number of records in the table.
You can select a range of records by searching for the first and last key values in the range. If those key values don't exist, you'll at least get a negative value showing where they would be, if they existed. e.g. Suppose you want to know which records have keys greater than "GGG" and less than "MMM". If -5 is returned for key "GGG", it means a record with "GGG" as a key would be inserted as record number 5. -27 for "MMM" means a record with "MMM" as its key would be inserted as record number 27. This quickly tells you that all records, >= 5 and < 27 qualify.

Example:

rec_num = db_find_key("Millennium") if rec_num > 0 then ? db_record_key(rec_num) ? db_record_data(rec_num) else puts(2, "Not found, but if you insert it,\n") printf(2, "it will be #%d\n", -rec_num) end if

See Also: db_record_key, db_record_data, db_insert

db_record_key

Syntax: include database.e
x = db_record_key(i)

Description: Return the key portion of record number i in the current table.

Comments: Each record in a Euphoria database consists of a key portion and a data portion. Each of these can be any Euphoria atom or sequence.

Example:

puts(1, "The 6th record has key value: ") ? db_record_key(6)

See Also: db_record_data

db_record_data

Syntax: include database.e
x = db_record_data(i)

Description: Return the data portion of record number i in the current table.

Comments: Each record in a Euphoria database consists of a key portion and a data portion. Each of these can be any Euphoria atom or sequence.

Example:

puts(1, "The 6th record has data value: ") ? db_record_data(6)

See Also: db_record_key

db_insert

Syntax: include database.e
i = db_insert(x1, x2)

Description: Insert a new record into the current table. The record key is x1 and the record data is x2. Both x1 and x2 can be any Euphoria data objects, atoms or sequences. The return code i1 is DB_OK if the record is inserted.

Comments: Within a table, all keys must be unique. db_insert() will fail with DB_EXISTS_ALREADY if a record already exists with the same key value.

Example:

if db_insert("Smith", {"Peter", 100, 34.5}) != DB_OK then puts(2, "insert failed!\n") end if

See Also: db_find_key, db_record_key, db_record_data

db_delete_record

Syntax: include database.e
db_delete_record(i)

Description: Delete record number i from the current table.

Comments: The record number, i, must be an integer from 1 to the number of records in the current table.

Example:

db_delete_record(55)

See Also: db_insert, db_table_size

db_replace_data

Syntax: include database.e
db_replace_data(i, x)

Description: In the current table, replace the data portion of record number i, with x. x can be any Euphoria atom or sequence.

Comments: The record number, i, must be from 1 to the number of records in the current table.

Example:

db_replace_data(67, {"Peter", 150, 34.5})

See Also: db_delete_record

db_compress

Syntax: include database.e
i = db_compress()

Description: Compress the current database. The current database is copied to a new file such that any blocks of unused space are eliminated. If successful, i will be set to DB_OK, and the new compressed database file will retain the same name. The current table will be undefined. As a backup, the original, uncompressed file will be renamed with an extension of .t0 (or .t1, .t2 ,..., .t99). In the highly unusual case that the compression is unsuccessful, the database will be left unchanged, and no backup will be made.

Comments: When you delete items from a database, you create blocks of free space within the database file. The system keeps track of these blocks and tries to use them for storing new data that you insert. db_compress() will copy the current database without copying these free areas. The size of the database file may therefore be reduced.
If the backup filenames reach .t99 you will have to delete some of them.

Example:

if db_compress() != DB_OK then puts(2, "compress failed!\n") end if

See Also: db_create

db_dump

Syntax: include database.e
db_dump(fn, i)

Description: Print the contents of an already-open Euphoria database. The contents are printed to file or device fn. All records in all tables are shown. If i is non-zero, then a low-level byte-by-byte dump is also shown. The low-level dump will only be meaningful to someone who is familiar with the internal format of a Euphoria database.

Example:

if db_open("mydata", DB_LOCK_SHARED) != DB_OK then puts(2, "Couldn't open the database!\n") abort(1) end if fn = open("db.txt", "w") db_dump(fn, 0)

See Also: db_open

db_fatal_id

Syntax: include database.e
db_fatal_id = i

Description: You can catch certain fatal database errors by installing your own fatal error handler. Simply overwrite the global variable db_fatal_id with the routine id of one of your own procedures. The procedure must take a single argument which is a sequence. When certain errors occur your procedure will be called with an error message string as the argument. Your procedure should end by calling abort().

Example:

procedure my_fatal(sequence msg) puts(2, "A fatal error occurred - " & msg & '\n') abort(1) end procedure db_fatal_id = routine_id("my_fatal")

See Also: db_close