Formatting page ...
1. The overall database design is known as a schema and the collection of information stored in database at a particular instant is an instance . schemas are changed very rarely and if at all, by database administrator alone . A database can have one physical schema , one logical schema and several sub schemas 2. data independence ----- the ability to modify schema definition in one level without effecting the schema definition in next higher level of database is data independence . Physical data independence is ability to modify physical schema without causing application programs be rewritten, whereas logical data independence is ability to modify logical schema of database without causing application programs be rewritten obviously logical data independence is more difficult to achieve as application programs are heavily dependent on logical structure of data they access 3 a transaction is a unit of programs execution that accesses and possibly updates various data items , ACID properties of a transaction are as follows ---> Atomicity ------ either transaction executes in it s entirety or none of it executes at all > Consistency ----- the state of the database remains the same both before and after transaction occurs , despite system changes > Isolation ----- despite the simultaneous execution of transactions, each transaction proceeds independently of the other despite systems changes > Durability --- any modifications made to the database as a result of transaction persist even if there are system changes we now explain the ACID properties of a transaction through examples , we have a schedule involving accounts A and B .the initial values in A and B accounts are $ 1000 and $ 2000 , now this is consistent state as sum A+B is preserved throughout , now a transaction occurs which transfers $ 50 from A to B , as a result final values in A and B are $ 950 and $ 2050 , this is clearly a consistent state as sum A+B is preserved both before and after transaction occurs , we now suppose that a system failure occurs after write[A] committed but before write[B] commits so now the final values in A and B are 950 and 2000 clearly inconsistent state as sum A+B is no longer preserved both before and after transaction, even if systems failure did not occurs and transaction proceeded to completion, such an inconsistent state would not have been visible except during execution of transaction, this once agai.n reflects atomicity property I.e either the transaction executes in it s entirety or none executes at all. Isolation is preserved since despite concurrent execution of transactions, each transaction proceeds independently of others despite systems changes , durability is preserved through recovery management component of transactions 4 . query processor ---- database users can be application programers , sophisticated users , specialized users and naive users . Sophisticated users are software personnel who interact with database through writing their queries as sql statements that is forwarded to query processor who then submits it to query manager 5. in the view level of data abstraction, there are entity rel,. Diagrams or ERDs that consist of Entities and their defining attributes entity is like object , encapsulates data abstraction, attribute inheritance , polymorphism . Entity can be weak entity or strong entity . A weak entity set is one whose attributes do not form a primary key a strong entity set is one whose attributes form a primary key attributes are the characteristics of entity attributes can be of following types ---> simple and composite ----- simple attributes cannot be more subdivided whereas composite attributes can be further subdivided > single valued and multi valued attributes ---- single valued attribute has only one value for entity in entity set whereas multi valued attribute has more than one values for entity in entity set > null attribute ---- a NULL value is returned where entity specifies attribute with no values > derived attributes ---- obtained from other existing attributes or entity sets entities in ERDs go into relation sets . Some of extended ER properties are specialization, generalization and aggregation, due to aggregation that does not support nested queries I.n RDBMS not possible to express relations amongst relations 6 types of SQL commands ---> data definition language ---- includes CREATE, DELETE etc > data manipulation language or DML ---- can be procedural or non procedural DML, Procedural DML states what data is stored and what relationships exist amongst stored data without going into physical implementation of same , whereas non procedural DML states what data is stored , what relationships exist amongst stored data and how data is physically stored in database . Examples are INSERT, UPDATE, DROP a relation or table etc > data control language --- DCL commands as GRANT , REVOKE etc regarding authorization to database 7 the removal of data repetition or data redundancy from existing tables or rels is Normalization and accordingly various normal forms of a relation schema as 1NF , 2NF , BCNF, 3NF, 4NF , 5NF etc while 1NF has redundant data that is removed to some extent in 2NF . Still higher normal form is BCNF that defines super keys . We have a rel schema r where for the set F of functional dependencies holding in r , we can find A belongs to r, B belongs to r , such that A-- B is a trivial dependency in F+ such that at least one of the following is true ---> A-- B is a trivial dependency in r > A is a super key of r every rel schema in 3NF is obviously in BCNF but the converse is not always true still higher normal form is 3NF that defines candidate key values and transitive dependencies, we have a rel schema r where for the set F of functional dependencies holding good here, we can find A belonging to r , B belonging to r , such that A -- B is a trivial dependency in F+, such that at least one of the following is true ---> A -- B is a trivial dependency in r > A is a super key of r > it is possible to find an extraneous attribute in [B-A] that is contained I.n a candidate key for r thus 3 NF defines candidate keys and hence transitive dependencies still higher normal form is 4NF that defines multi valued dependencies in case of 5NF or project join normal form , defines join dependencies , we have a rel schema r that is divided into subsets r1 , r2 , .. such that r = r1ur2u . . then *[ r1 r2 . ] is a join dependency in r , if further, each of r s is a super key of r , then rel schema r is in 5NF or project join normal form 8 functional dependency can be of following types -------- full functional dependency ---- all non key attributes are fully dependent on key attribute ----- partial functional dependency ----- some non key attributes are dependent on key attribute and remaining non key attributes are de;pendent on key attribute and on some other non key attribute ------ transitive dependency ---- if the attributes in key column of one relation correspond to key values in column of another relation this type of functional dependency is transitive dependency ---- multi valued dependency ----- encountered in 4NF ----- join dependency ---- encountered in 5NF , 9 Different types of lock --- a transaction must essentially lock all it s data items before it starts execution ,locking can be shared mode or exclusive mode lock shared mode lock means that a transaction can read as well as write data items exclusive mode lock means that a transaction can just read , not write the data items 10 where a database is to be recovered from hard disk crash we use the <check pointing> method of log based recovery , a <check point> is a special type of log record , the main steps in this recovery process using checkpoints is as follows ---> output onto stable storage all log records currently residing in main memory > output all modified buffer blocks onto disk > output onto stable storage a log record < check point> > no database modifications can occur where a < check point> in progress we now suppose that a hard disk crash occurs while a transaction is executing , the log is consulted backwards till it identifies the < transaction commit> record , the log is still consulted backwards till it finds the < check point> record and is more consulted backwards till finds the <transaction start> record, that which identifies the transaction now undo or redo operations can be performed on this transaction and any transactions executing after , for concurrent transaction processing, fuzzy check pointing is used 11 database systems versus file systems ------> in file processing data is not integrated but kept separately in file storage whereas in DBMS data is integrated and stored in a centralized data servers > data redundancy is removed through normalization in DBMS whereas data redundancy and normalization is absent in file processing > database provides multiple views of data to different users and unlike in file processing > database provides multiple user access to data unlike in file processing 12 the main idea in RDBMS is data abstraction , that is , providing users with abstract view of data . Accordingly views of data defined at physical level, logical level and view level as accordingly schemas are defined at lowest level of database design there is physical schema or physical view that describes how the data is actually physically stored in database without going into logical implementation of same at next higher level of database design is logical view or logical schema that defines what data is stored and what relationships exist amongst stored data without going into physical implementation of same at highest level of database design there is view level or subschemas that show part of database to users . A database can have one physical schema , one logical schema and several subschemas . 13 types of data models ----> defined at the highest level of data abstraction, view level ---- can be ERDs or object oriented databases . ERDs consist of entities, attributes and rel. sets . Some of extended ER properties are specialization, generalization and aggregation, due to aggregation possible to express rels amongst rels in RDBMS and aggregation supports nested queries , in case of object oriented databases there are 2 levels of data abstraction > defined at next higher level of data abstraction, logical level --- can be relational tables , network, hierarchical databases . Relational tables consist of rows and columns and identified with keys ,.network databases consist of linked lists and arbitrary graph representation is possible . Whereas hierarchical databases consist of linked lists and arbitrary graph representation is not possible , just tree representation is possible . consists of virtual records , where a virtual record has no data items but includes a logical pointer to 1st physical record . Where data is to be replicated in several database trees we keep a copy of that record in one of the trees and replace every other copies with a virtual record , including a logical pointer to 1st physical record > defined at lowest level of data abstraction, physical level ---- examples are frame memory model, UML database etc 14 the main idea in RDBMS is data abstraction or to provide users with abstract view of data . Accordingly views of database can be defined at physical level, logical level and view level. Views in SQL are in fact temporary tables describing part of database to users > physical view ---- at lowest level of data abstraction, describes how the data is actually physically stored in database without going I.nto logical implementation > logical view ----- at next higher level of data abstraction, logical view that describes what data and what relationships exist amongst stored data without going into physical implementation of same > at highest level of data abstraction, view level ---- describes part of database to users , also called as subschemas . A database can have one physical schema , one logical schema and several subschemas 15 different types of database users ---> application programmers ---- are software programmers who interact with database through high level programming languages as COBOL FORTRAN etc > sophisticated users ----- sw personnel who write database queries as SQL statements and provide it to query processor who then forwards it to query manager > specialized users ---- who interact with database by writing CAD . CAE etc and graphical user interface standards > naive users --- unsophisticated users who interact with database by invoking one or more queries written previously to database 16 the avoidance of accidental loss of data consistency is integrity and accordingly integrity constraints as entity integrity , domain integrity , referential integrity . > entity integrity --- if attributes of a table are prime attributes, they cannot have null values . Thus values of primary key attributes cannot be NULL > Domain integrity ---- A domain is a set of atomic values , the condition for domain integrity states that values of entity must be from within its domain > referential integrity ----- encountered in ERDs , in SQL etc , where an entity is distinguishable from other entities through some attributes that are specific to that entity , this condition is called as referential integrity in SQL referential integrity is imposed through a declarative mechanism called as triggers . A trigger is a statement that is automatically executed in SQL as a side effect of modification to database , triggers enforce referential integrity through violating the very condition that invoked them. Whereas in ERDs referential integrity is enforced with foreign keys , we have two relation schemas r1 and r2 where for tuples t1 , t2 we have primary key values k1 , k2 in r1 , r2 such that we can find A , a subset of r2 , a foreign key referencing k1 in r1 such that the following holds true ---- t1[k1]= t2[A] . this is example of referential integrity constraint . For referential integrity to be valid here , either k1 must be equal to A or k1 and A must be compatible attributes 17 weak entity set is entity set whose attributes do not form a primary key strong entity set is entity set whose attributes form a primary key 18 two phase locking protocol guarantees serializability but does not provide freedom from deadlock . Built upon two distinct stages ---- the growth stage and shrinking stage the starting point in this protocol is growing stage where the transaction can only isue new lock requests and obtain new locks and it cannot release acquired locks , the point in the growing stage where transaction has obtained its final lock and can command no more lock requests is lock point of transaction, it thereafter enters the shrinking stage where transaction can release locks but cannot obtain any new locks if transactions are ordered according to their lock points , it is a serializable ordering for the transactions 19
|