Sqlite Database Structure and File Format

Mariah | September 11th, 2015 | Forensics

Sqlite has become the favorite database management system of developers since it supports on different platforms such as iOS, Android, etc. Sqlite is an in-process library management, is fast and simple. As the name suggests ‘Lite’, it is a lightweight database and hence applicable for all applications. It has several bindings with the programming languages such as, C, C++, Java, Delphi, etc. and the source code is available for the users from the public domain thus, makes effortful for both private as well as commercial purposes. This session is all about bringing an idea about the structure and the file format of the Sqlite database. Let us start the discussion with the structure of the database.

Sqlite Database Structure

Sqlite File Format

The above figure describes the block diagram showing the Sqlite database components and their relationship.

Let us grab an idea of each components from the below segment.

Interface

The library of Sqlite is implemented using the functions; main.c, legacy.c, and vdbeapi.c. In-order to avoid the collision with other application, Sqlite library starts with Sqlite3.

Tokenizer

Tokenizer breaks downs the strings and separates the token then, passes the tokens to the parser.

Parser

The parser is generated with the Lemon LALR(1) parser generator. Parse assigns meaning to the token.

Code Generator

Code generator generates virtual machine codes for the SQL statements thus, enables to satisfy the request made by the user. There are many files in the generator, which helps to generate the codes such as, build.c, delete.c, attach.c, expr.c, insert.c, pragma.c, select.c, auth.c, etc.

Virtual Machine

The virtual machine under Sqlite database structure runs the code generated by the code generator. It has its own header file called; vdbe.h and the entire machine is stored in single vdbe.c file.

B-Tree

The database follows B-tree structure and different B-trees are meant for index as well as the tables. All these trees are stored under one disk.

Page Cache

Information is seen in fixed chunks and the B-trees module request this data or information. The page cache helps to attain the reading, writing, etc. of the chunks.

OS Interface

To get interfaced with the OS, Sqlite uses abstraction layer and it is defined in os.h. The implementation may vary according to the Operating Systems.

Utilities

The util.c contains the memory allocations and the routines of the string.

All the reads and writes of Sqlite database structure are done to an ordinary disk and since all the transactions follow ACID property, failure or power loss does not affect the transaction processing. Sqlite is a self-contained, serverless database system engine, has a cross-platform file format. The detail of its file format is discussed in the coming up sessions. And that will give you an idea on the file format of Sqlite.

Sqlite File Format

The database is carried in a single file called “main database file” and there is a second file for storing the additional information that may be added during the transaction process called as, ‘rollback journals’ or write-ahead log file. Suppose, suddenly the system crashed then, rollback journal or write-ahead log file that contains critical state information is used to restore the main database file. In such situations, rollback journal are called as, “hot journals or “hot WAL file”; contains necessary information for bringing back the database.

Pages

The database file consists of more than one page and the size of a page depends on the power of two between 512 and 65536. The page numbering starts from 1 and the maximum number is (231 – 2). All the pages in the main database have a single use either;

  • Lock-byte page
  • Freelist page
  • B-tree page
  • Pointer map page
  • Payload overflow page

Remember, before modifying the database page, the unmodified contents will be written to WAL file. The first two pages are only described here. For more details about Sqlite file format, you can refer sqlite.org.

Database Header

The header is comprised in the first 100 bytes.

Header format:

Sqlite Database Structure

Lock-Byte Page

During the study of Sqlite file format, It is single page of database file that contains bytes at offsets between 1073741824 and 1073742335. A database file less than 1073741824 bytes contain no page but, file larger than 1073741824 has one such page. It is used by OS and Sqlite does not use this page.

Freelist Page

Sometimes information may be deleted from the database pages making the page unused. These unused pages are stored in freelist and is reused when additional pages are required, if any.

How are the files stored in the Sqlite database? Which extension is followed? The following session will discuss that.

.db

The primary database file of Sqlite is stored as .db file. Moreover, .db3 file is the primary database file of Sqlite version 3.

.db3-journal

The .db3-journal file is used to roll back the database.

.sqlite

The .sqlite file is created default whenever, a database is created.

.sqlitedb

When the database stores contacts for devices such as iPhone, iPad, etc. the .sqlitedb files are created.

.sqlite-shm

This file is used when Sqlite database works in WAL active mode.

.sqlite-wal

These files are also deployed when Sqlite functions in a WAL active mode. The Write Ahead Log file stores the data generated by Write Ahead options.

There are even other file systems or formats used by the Sqlite still, some of the important ones are mentioned in the above sessions. The evolution of Sqlite database engine has marked a remarkable growth in the development of applications. This database is even used in deploying some software products as well, thus making the favorite database engine for all enterprises.