Thursday, December 10, 2009

ASSIGN_02 current trends

1. SQL Azure with ASP Dot Net

SQL Azure (SSDS - SQL Server Data Services) is a cloud database system offered by Microsoft. We interact with the SQL Azure services by either issuing statements to it though a command prompt or developing Dot Net applications. This article will introduce and demonstrate development of SQL Azure ASP Dot Net applications.

2. IDMS
Automated Migration for the IDMS Database

The IDMS database life cycle is coming to the end. Many enterprises are making the decision to migrate their IDMS databases to relational ones: DB2, SQL Server or Oracle. Our database migration solution: a highly automated and efficient tool for IDMS conversion to the industry standard databases - DB2, SQL and Oracle. This solution gives you the benefits of automated migration, while allowing you to control the entire process.

Our solution is for IDMS running on z\OS, z\VSE and VME
Business Benefits

* Dramatically reduces license and maintenance costs
* Makes application enhancement easier
* Provides cost effective migration
* Eliminates the IT resources problem
* Uses proven automated tools, thereby mitigating many of the project risks
* Provides a complete solution for language, database, and platform modernization
* Utilizes BluePhoenix’s extensive experience with modernization projects (over 20 years)

Wednesday, November 18, 2009

ASSIGN_1

A hierarchical data model is a data model in which the data is organized into a tree-like structure. The structure allows repeating information using parent/child relationships: each parent can have many children but each child only has one parent. All attributes of a specific record are listed under an entity type.
In a database, an entity type is the equivalent of a table; each individual record is represented as a row and an attribute as a column. Entity types are related to each other using 1: N mapping, also known as one-to-many relationships.

A relational database matches data by using common characteristics found within the data set. The resulting groups of data are organized and are much easier for people to understand.
For example, a data set containing all the real-estate transactions in a town can be grouped by the year the transaction occurred; or it can be grouped by the sale price of the transaction; or it can be grouped by the buyer's last name; and so on.
Such a grouping uses the relational model (a technical term for this is schema). Hence, such a database is called a "relational database."
The software used to do this grouping is called a relational database management system. The term "relational database" often refers to this type of software.

Wednesday, November 18, 2009

ASSIGN_01

HIERARCHICAL VS RELATIONAL

RELATIONAL

Advantages:
* Is to provide a declarative method for specifying data and queries.
* It directly state what information the database contains and what information we want from it.
* The Database management system software can take care of describing data structures for storing the data and retrieval procedures for getting queries answered.
* It have become a predominant choice for the storage of information in new databases used for financial records, manufacturing and logistical information, personnel data and much more.

Dis-Advantages:
*
As computer power has increased, the inefficiencies of relational databases, which made them impractical in earlier times, have been outweighed by their ease of use.
* They are much less efficient.

HIERARCHICAL

Advantages:
* The user can identify quickly and understand the information that is being stored in the database due to the design of its structure.
* The information is arranged well in a tree-like structure which makes it the database more efficient to read and understand.
* More common to use.

Dis-Advantages:
*
All attributes of a specific record are listed under an entity type and each entity type is equivalent to one table so it is more time consuming compared to Relational.
* You still have to make tables in each parent and children in a structure for you to relate them together.
* It is more complicated rather that Relational which you can understand and much easier to use.

Saturday, August 15, 2009

" MY_IDEA_IS"

PART II

A. Discuss what you have learned and understood about what Relational Database Management System is, so far.

= as what I have understood about Relational Database Management System is, it is about saving or keeping a data that uses a System or Software for proper management of data for future uses. Specially to those person whom to be very reliable to data, this is very important.

= A Relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as introduced by E. F. Codd. Most popular commercial and open source databases currently in use are based on the relational model.

A short definition of an RDBMS may be a DBMS in which data is stored in the form of tables and the relationship among the data is also stored in the form of tables.

B. Define how each of the following fit and function within the framework of relational DBMS systems:

1. Key field= is a field or set of fields of a database (typically a relational database) table which together form a unique identifier for a database record (a table entry). The aggregate of these fields is usually referred to simply as "the key".

2. Database records=it is the records of files or data that uses database management.

3. Data queries=it lets you to put some data or question on the database.

4. Data types=is a set of values and the operations on those values.

5. Data forms=it is the form of saving or keeping data according to the user of it.

6. Table/ Database files=tables is used in database for easy easy manipulation of data and for proper arranging on it.

=is a set of data elements (values) that is organized using a model of vertical columns (which are identified by their name) and horizontal rows. A table has a specified number of columns, but can have any number of rows. Each row is identified by the values appearing in a particular column subset which has been identified as a candidate key.

7. Relationship (Table linkages)= it is use in forming tables, for example, if you have the name of the person as one of your data and you want to put some more information on it so your going to use link. It is somehow like HTML.

Wednesday, July 1, 2009

"MY_ASSIGNMNTS"

Data type


-> A data type (or datatype) in programming languages is a set of values and the operations on those values.

Classes of data types:

1. Abstract Data Type

In computing, an abstract data type (ADT) or abstract data structure is a mathematical model for a certain class of data structures that have similar behavior; or for certain data types of one or more programming languages that have similar semantics. An ADT is defined indirectly, only by the operations that may be performed on it and by mathematical constraints on the effects (and possibly cost) of those operations .

For example, an abstract stack data structure could be defined by two operations: push, that inserts some data item into the structure, and pop, that extracts an item from it; with the constraint that each pop always returns the most recently pushed item that has not been popped yet. When analyzing the efficiency of algorithms that use stacks, one may also specify that both operations take the same time no matter how many items have been pushed into the stack, and that the stack uses a constant amount of storage for each element.

ADTs are purely theoretical entities, used (among other things) to simplify the description of abstract algorithms, to classify and evaluate data structures, and to formally describe the type systems of programming languages. However, an ADT may be implemented by specific data types or data structures, in many ways and in many programming languages; or described in a formal specification language. ADTs are often implemented as modules: the module's interface declares procedures that correspond to the ADT operations, sometimes with comments that describe the constraints. This information hiding strategy allows the implementation of the module to be changed without disturbing the client programs. Abstract data types are also an important conceptual tool in object-oriented programming and design by contract methodologies for software development.

The name "abstract data type" apparently was coined by researches in software engineering and programming language design; while "abstract data structure" was coined by researchers in data structures and algorithms.

2. Algebraic data types

, an algebraic data type (sometimes also called a variant type[1]) is a datatype each of whose values is data from other datatypes wrapped in one of the constructors of the datatype. Any wrapped datum is an argument to the constructor. In contrast to other datatypes, the constructor is not executed and the only way to operate on the data is to unwrap the constructor using pattern matching.

The most common algebraic data type is a list with two constructors: Nil or [] for an empty list, and Cons (an abbreviation of constructor), ::, or : for the combination of a new element with a shorter list (for example (Cons 1 '(2 3 4)) or 1:[2,3,4]).

Special cases of algebraic types are product types i.e. records (only one constructor), sum types or tagged unions (many constructors with a single argument) and enumerated types (many constructors with no arguments). Algebraic types are one kind of composite type (i.e. a type formed by combining other types).

An algebraic data type may also be an abstract data type (ADT) if it is exported from a module without its constructors. Values of such a type can only be manipulated using functions defined in the same module as the type itself.

In set theory the equivalent of an algebraic data type is a discriminated union – a set whose elements consist of a tag (equivalent to a constructor) and an object of a type corresponding to the tag (equivalent to the constructor arguments).

3. Composite data types

In computer science, composite data types are data types which can be constructed in a program using its programming language's primitive data types and other composite types. The act of constructing a composite type is known as composition.

4. Function types

Type signature is a term that is used in computer programming. A type signature defines the inputs and outputs for a function or method. A type signature includes at least the function name and the number of its parameters. In some programming languages, it may also specify the function's return type or the types of its parameters.

5. Machine data types

All data in computers based on digital electronics is represented as bits (alternatives 0 and 1) on the lowest level. The smallest addressable unit of data is a group of bits called a byte (usually an octet, which is 8 bits). The unit processed by machine code instructions is called a word (as of 2008, typically 32 or 64 bits). Most instructions interpret the word as a binary number, such that a 32-bit word can represent unsigned integer values from 0 to 232 − 1 or signed integer values from − 231 to 231 − 1. Because of two's complement, the machine language and machine don't need to distinguish between these unsigned and signed data types for the most part.

There is a specific set of arithmetic instructions that use a different interpretation of the bits in word as a floating-point number.

6. Object types


In computer science, an object commonly means a data structure consisting of data fields and procedures (methods) that can manipulate those fields. Typically, when calling a method from some object, the object itself should be passed as a parameter to the method.

Objects are the foundation of object-oriented programming, and are fundamental data types in object-oriented programming languages. These languages provide extensive syntactic and semantic support for object handling, including a hierarchical type system, special notation for declaring and calling methods, and facilities for hiding selected fields from client programmers. However, objects and object-oriented programming can be implemented in any language.

Objects have proven to be very helpful in software development, particularly for large programs. For one thing, they are a natural way to implement abstract data structures, by "physically" bringing together the data components with the procedures that manipulate them. More importantly, they make it possible to handle very disparate objects by the same piece of code, as long as they all have the proper method. They also improve program reliability, simplify software maintenance, the management of libraries, and the division of work in programmer teams. Object-oriented programming languages are generally designed to exploit and enforce these potential advantages of the object model.

Outside object-oriented programming, the word object may also mean simply any entity that can be manipulated by the commands of a programming language, such as a value (computer science), variable, function, or data structure.

7. Pointer and reference data types


In computer science, a reference is a value that enables a program to directly access the particular data item, such as a variable or a record, in the computer's memory or in some other storage device. The reference is said to refer to the data item, and accessing that data is called dereferencing the reference.

A reference is distinct from the data itself. Typically, a reference is the physical address of where the data is stored in memory or in the storage device. For this reason, a reference is often called a pointer or address, and is said to point to the data. However a reference may also be the offset (difference) between the datum's address and some fixed "base" address, or an index into an array.

The concept of reference must not be confused with other values (keys or identifiers) that uniquely identify the data item, but give access to it only through a non-trivial lookup operation in some table data structure.

References are widely used in programming, especially to efficiently pass large or mutable data as arguments to procedures, or to share such data among various uses. In particular, a reference may point to a variable or record that contains references to other data. This idea is the basis of indirect addressing and of many linked data structures, such as linked lists.

A reference may be compared to a street address, such as "12 Main Street" or "three houses down the road on the left side". Going to the building with that address is analogous to dereferencing the reference. The name "Bob and Joe's Car Shop" might be a unique identifier for the same building, but cannot be compared to a data reference, because finding the building with that name requires a non-trivial search or a lookup in some directory. A reference stored in a data record can be compared to a sign on that shop saying "For tire service please go to 20 Cross Street". Passing a reference to a subroutine, instead of the data, is like giving your friend's the address of that shop, instead of taking Bob and Joe and all their tools to your friend's home.

8. Primitive data types


In computer science, primitive data type can refer to either of the following concepts:[citation needed]

  • a basic type is a data type provided by a programming language as a basic building block. Most languages allow more complicated composite types to be recursively constructed starting from basic types.
  • a built-in type is a data type for which the programming language provides built-in support.

In most programming languages, all basic data types are built-in. In addition, many languages also provide a set of composite data types. Opinions vary as to whether a built-in type that is not basic should be considered "primitive".[citation needed]

Depending on the language and its implementation, primitive data types may or may not have a one-to-one correspondence with objects in the computer's memory. However, one usually expects operations on basic primitive data types to be the fastest language constructs there are.[citation needed] Integer addition, for example, can be performed as a single machine instruction, and some processors offer specific instructions to process sequences of characters with a single instruction. In particular, the C standard mentions that "a 'plain' int object has the natural size suggested by the architecture of the execution environment". This means that int is likely to be 32 bits long on a 32-bit architecture. Basic primitive types are almost always value types.

Most languages do not allow the behavior or capabilities of primitive (either built-in or basic) data types to be modified by programs. Exceptions include Smalltalk, which permits all data types to be extended within a program, adding to the operations that can be performed on them or even redefining the built-in operations.



" 3 DataBase Management System "

A database management system (DBMS) is computer software that manages databases. DBMSes may use any of a variety of database models, such as the network model or relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way.

A. Microsoft Office Access

-> previously known as Microsoft Access, is a relational database management system from Microsoft that combines the relational Microsoft Jet Database Engine with a graphical user interface and software development tools. It is a member of the Microsoft Office suite of applications and is included in the Professional and higher versions for Windows and also sold separately. There is no version for MacOS or for Microsoft Office Mobile.

Access stores data in its own format based on the Access Jet Database Engine. It can also import or link directly to data stored in other Access databases, Excel, SharePoint lists, text, XML, Outlook, HTML, dBase, Paradox, Lotus 1-2-3, or any ODBC-compliant data container including Microsoft SQL Server, Oracle, MySQL and PostgreSQL. Software developers and data architects can use it to develop application software and non-programmer "power users" can use it to build simple applications. It supports some object-oriented techniques but falls short of being a fully object-oriented development tool.

Microsoft Access is part of the Microsoft Office suite and is the most popular Windows desktop database application. It is targeted for the information worker market, and is the natural progression for managing data when the need for a relational database arises or after reaching the limits of Microsoft Excel.


B. Linter SQL RDBMS

->is the main product of RELEX Group. Linter is a Russian DBMS compliant with the SQL-92 standard and supporting the majority of operating systems, among them Win32 (including WinCE), NetWare, various versions of Unix, OS9, QNX, VxWorks and others. The system enables transparent interaction between the client applications and the database server functioning in different hardware and software environments. DBMS Linter includes program interfaces for the majority of popular development tools. The system provides a high data security level allowing the user to work with secret information. Linter is the only DBMS certified by FSTEC of Russia as compliant with Class 2 data security requirements and Level 2 of undeclared feature absence control. For more than ten years, Linter has been used by Russian Ministry of Defense, Ministry of Foreign Affairs and other government bodies.


C. Visual FoxPro
-> is a data-centric object-oriented and procedural programming language produced by Microsoft. It is derived from FoxPro (originally known as FoxBASE) which was developed by Fox Software beginning in 1984. Fox Technologies merged with Microsoft in 1992, after which the software acquired further features and the prefix "Visual". The last version of FoxPro (2.6) worked under Mac OS, DOS, Windows, and Unix: Visual FoxPro 3.0, the first "Visual" version, dropped the platform support to only Mac and Windows, and later versions were Windows-only. The current version of Visual FoxPro is COM-based and Microsoft has stated that they do not intend to create a Microsoft .NET version.

FoxPro originated as a member of the class of languages commonly referred to as "xBase" languages, which have syntax based on the dBase programming language. Other members of the xBase language family include Clipper and Recital. (A history of the early years of xBase can be found in the dBase entry.)

Visual FoxPro, commonly abbreviated as VFP, is tightly integrated with its own relational database engine, which extends FoxPro's xBase capabilities to support SQL query and data manipulation. Unlike most database management systems, Visual FoxPro is a full-featured, dynamic programming language that does not require the use of an additional general-purpose programming environment. It can be used to write not just traditional "fat client" applications, but also middleware and web applications.

Friday, June 26, 2009

Memory Variable vs. Data Field

Memory Variable

The memory of a computer is organized into a regular pattern of containers for information. These containers for information are called "words". Each one has a numeric address and each one is the same size as each of the others. For most applications, it is inconvenient to refer to portions of memory by their numeric addresses, so programming languages allow us to allocate portions of memory by name. When we store information in the memory of a computer we need to decide on how much we need for various purposes and on how it will be organized. Programming languages provide mechanism for "types" of information in memory. They also provide mechanisms to identify repetitive arrays of items of the same type and to aggregate possibly heterogeneous types under a common name.

Portion of word

Number of bits

nibble

4

character

8 or 16

byte

8

octet

8

half-word

16

word

32

The ambiguity on the size of characters is because we are in a time of transition to richer and more complex character sets for which the limit of 256 characters imposed by an 8-bit character size is inappropriate. We may also be starting on a transition to 64-bit words. Time will tell.


Data Field

A data field is a place where you can store data. Commonly used to refer to a column in a database or a field in a data entry form or web form.

The field may contain data to be entered as well as data to be displayed.

Monday, June 22, 2009

"TERMCONTRAST"

A. INFORMATION VS. DATA

Information as a concept has a diversity of meanings, from everyday usage to technical settings. Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, and representation. It is also the facts that was received and understood by. While Data are pieces of information that represent the qualitative or quantitative attributes of a variable or set of variables. Data (plural of "datum") are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which information and knowledge are derived. It is the fact that can be drawn.

B. DATA STORAGE VS. COMPUTER STORAGE

Actually Data Storage is just part of Computer storage because Data Storage is just part of the memory of the computer where data was saved.Computer data storage, often called storage or memory, refers to computer components, devices, and recording media that retain digital data used for computing for some interval of time. Computer data storage provides one of the core functions of the modern computer, that of information retention. It is one of the fundamental components of all modern computers, and coupled with a central processing unit (CPU, a processor), implements the basic computer model used since the 1940s. While Computer Storage is the memory that where documents are saved like for example Hard Disks.

C. OPERATING SYSTEM VS. COMPUTER SYSTEM

Operating System is a system that is tasks to manipulate computer programs. Operating system (commonly abbreviated to either OS or O/S) is an interface between hardware and user; it is responsible for the management and coordination of activities and the sharing of the resources of the computer. The operating system acts as a host for computing applications that are run on the machine. As a host, one of the purposes of an operating system is to handle the details of the operation of the hardware. This relieves application programs from having to manage these details and makes it easier to write applications. Almost all computers (including handheld computers, desktop computers, supercomputers, video game consoles) as well as some robots, domestic appliances (dishwashers, washing machines), and portable media players use an operating system of some type. [1] Some of the oldest models may however use an embedded operating system, that may be contained on a compact disk or other data storage device. While a Computer System is a complete, working computer. The computer system includes not only the computer, but also any software and peripheral devices that are necessary to make the computer function. Every computer system, for example, requires an operating system. It can also manipulate many computers like computer station.

Sunday, March 8, 2009

Sorts

A.)
Definition: Data Structure

= is the process of keeping data into a computer system. It is eventually made from keeping data or finding data up to the process of storing. The first thing to do is to look for the data that you’re going to keep then after you have that data, the next thing to do is store it in the computer system. You must encode the data or information that is very important then save it into your computer. The purpose of data structure is how to preserve a data into the safest receptacle which is computer system. Personal Definition:

= in computer science is a way of storing data in a computer so that it can be used efficiently. http://en.wikipedia.org/wiki/Data_structure

= an organization of mathematical and logical concepts of data. Often a carefully chosen data structure will allow the most efficient algorithm to be used. -http://en.wiktionary.org/wiki/data_structure

= can be viewed as an interface between two functions or as an implementation of methods to access storage that is organized according to the associated data type. http://www.cplusplus.com/doc/tutorial/structures.html

= are implemented by a programming language as data types and the references and operations they provide. http://en.wikipedia.org/wiki/Data_structure

= a means of storing a collection of data. -http://www.cs.ucc.ie/~dgb/courses/swd/glossary.html

= allows a variety of critical operations to be performed, using as few resources, both execution time and memory space, as possible. http://en.wikipedia.org/wiki/Data_structure

= the study of methods for effectively using a computer to solve problems, or in other words, determining exactly the problem to be solved. http://www.cplusplus.com/doc/tutorial/structures.html

= are suited to different kinds of applications, and some are highly specialized to certain tasks. -http://www.itl.nist.gov/div897/sqg/dads/HTML/datastructur.html

= is a group of data elements grouped together under one name. These data elements, known as members, can have different types and different lengths. Data structures are declared in C++ using the following syntax. -http://en.wikipedia.org/wiki/Data_structure

= are a featured that can be used to represent databases, especially if we consider the possibility of building arrays of them. -http://www.itl.nist.gov/div897/sqg/dads/HTML/datastructur.html





B.)
1. Deque Data Structure
- n computer science theory, a deque (short for double-ended queue—usually pronounced deck) is an abstract list type data structure, also called a head-tail linked list, for which elements can only be added to or removed from the front (head) or back (tail).

Structure
Deque is sometimes written dequeue, but this use is generally deprecated in technical literature or technical writing because dequeue is also a verb meaning "to remove from a queue". Nevertheless, several libraries and some writers, such as Aho, Hopcroft, and Ullman in their textbook Data Structures and Algorithms, spell it dequeue. DEQ and DQ are also used.

Implementations

There are at least two common ways to efficiently implement a deque: with a modified dynamic array or with a doubly-linked list. For information about doubly-linked lists, see the linked list article.

Dynamic array implementation

Deques are often implemented using a variant of a dynamic array that can grow from both ends, sometimes called array deques. These array deques have all the properties of a dynamic array, such as constant time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant time insertion/removal at both ends, instead of just one end. Two common implementations include:

  • Storing deque contents in a circular buffer, and only resizing when the buffer becomes completely full. This decreases the frequency of resizings, but requires an expensive branch instruction for indexing.
  • Allocating deque contents from the center of the underlying array, and resizing the underlying array when either end is reached. This approach may require more frequent resizings and waste more space, particularly when elements are only inserted at one end.

Language support

C++'s Standard Template Library provides the templated classes std::deque and std::list, for the dynamic array and linked list implementations, respectively.

As of Java 6, Java's Collections Framework provides a new Deque interface that provides the functionality of insertion and removal at both ends. It is implemented by classes such as ArrayDeque (also new in Java 6) and LinkedList, providing the dynamic array and linked list implementations, respectively.

Python 2.4 introduced the collections module with support for deque objects.

2. Heap Data Structure

THE STRUCTURE
-In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B). This implies that an element with the greatest key is always in the root node, and so such a heap is sometimes called a max heap. (Alternatively, if the comparison is reversed, the smallest element is always in the root node, which results in a min heap.) This is why heaps are used to implement priority queues. The efficiency of heap operations is crucial in several graph algorithms.

OPERATION TO GET CONTENTS ON THE NODES:

The operations commonly performed with a heap are

  • delete-max or delete-min: removing the root node of a max- or min-heap, respectively
  • increase-key or decrease-key: updating a key within a max- or min-heap, respectively
  • insert: adding a new key to the heap
  • merge: joining two heaps to form a valid new heap containing all the elements of both.
Comparison of theoretic bounds for variants

The following complexities[1] are worst-case for binary and binomial heaps and amortized complexity for Fibonacci heap. O(f) gives asymptotic upper bound and Θ(f) is asymptotically tight bound (see Big O notation). Function names assume a min-heap.

Operation

Binary

Binomial

Fibonacci

createHeap

Θ(1)

Θ(1)

Θ(1)

findMin

Θ(1)

Θ(lg n) or Θ(1)

Θ(1)

deleteMin

Θ(lg n)

Θlg n)

O(lg n)

insert

Θ(lg n)

O(lg n)

Θ(1)

decreaseKey

Θ(lg n)

Θ(lg n)

Θ(1)

merge

Θ(n)

O(lg n)

Θ(1)

For pairing heaps the insert and merge operations are conjectured[citation needed] to be O(1) amortized complexity but this has not yet been proven. decreaseKey is not O(1) amortized complexity [1] [2]

HEAP RETRIEVAL

Heaps are a favorite data structures for many applications.

Interestingly, full and almost full binary heaps may be represented using an array alone. The first (or last) element will contain the root. The next two elements of the array contain its children. The next four contain the four children of the two child nodes, etc. Thus the children of the node at position n would be at positions 2n and 2n+1 in a one-based array, or 2n+1 and 2n+2 in a zero-based array. Balancing a heap is done by swapping elements which are out of order. As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.

One more advantage of heaps over trees in some applications is that construction of heaps can be done in linear time using Tarjan's algorithm.

HEAP IMPLEMENTATION

  • The C++ Standard Template Library provides the make_heap, push_heap and pop_heap algorithms for binary heaps, which operate on arbitrary random access iterators. It treats the iterators as a reference to an array, and uses the array-to-heap conversion detailed above.

3. VLIST DATA STRUCTURES

In computer science, the VList is a persistent data structure designed by Phil Bagwell in 2002 that combines the fast indexing of arrays with the easy extension of cons-based (or singly-linked) linked lists.[1]

OPERATIONS TO ENTER ELEMENTS(DATA)

The primary operations of a VList are:

  • Locate the kth element (O(1) average, O(log n) worst-case)
  • Add an element to the front of the VList (O(1) average, with an occasional allocation)
  • Obtain a new array beginning at the second element of an old array (O(1))
  • Compute the length of the list (O(log n))

STRUCTURE

The underlying structure of a VList can be seen as a singly-linked list of arrays whose sizes decrease geometrically; in its simplest form, the first contains the first half of the elements in the list, the next the first half of the remainder, and so on. Each of these blocks stores some information such as its size and a pointer to the next.

The average constant-time indexing operation comes directly from this structure; given a random valid index, we simply observe the size of the blocks and follow pointers until we reach the one it should be in.

DATA RETRIEVAL

Any particular reference to a VList is actually a <base, offset> pair indicating the position of its first element in the data structure described above. The base part indicates which of the arrays its first element falls in, while the offset part indicates its index in that array. This makes it easy to "remove" an element from the front of the list; we simply increase the offset, or increase the base and set the offset to zero if the offset goes out of range. If a particular reference is the last to leave a block, the block will be garbage-collected if such facilities are available, or otherwise must be freed explicitly.

Because the lists are constructed incrementally, the first array in the array list may not contain twice as many values as the next one, although the rest do; this does not significantly impact indexing performance. We nevertheless allocate this much space for the first array, so that if we add more elements to the front of the list in the future we can simply add them to this list and update the size. If the array fills up, we create a new array, twice as large again as this one, and link it to the old first array.