Changes between Version 2 and Version 3 of ConsiderationsForUsingTheFormatToCreateADatabase


Ignore:
Timestamp:
12/29/10 08:45:25 (8 years ago)
Author:
stels
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • ConsiderationsForUsingTheFormatToCreateADatabase

    v2 v3  
    77Sometimes it might be needed to have a fixed value that shows from which visit or merger a value originates, this does not only apply to the //VIS// table but could be applied to all tables. This however does depend on the nature of the database and needs for data management, the field below should be considered an administrative support field for data management. 
    88 
    9  TODO:: where to store the VISIT fields? 
     9||= **VISIT** =|| 
     10|| Visit number || 
     11|| Numeric: [[BR]] `0` = Baseline Visit [[BR]] `1` = First follow up visit [[BR]] `2` = Second follow up visit [[BR]] etc. 
    1012 
    1113Often the above field is used for clinical trials databases where there is a need to associate the data directly with a given week’s follow-up.  Codes could then be the week number e.g. 4, 12, 24 etc or –1 for screening/randomisation and 0 for baseline visits. 
     
    1315In some cases it might be useful to have a separate field that defines the correct order of the periods. This becomes important where several dates are incomplete (unknown days, unknown months and possibly unknown years). The ordering by date would then not be correct. 
    1416 
    15 One solution to this is use a //PERI_ID// field that numbers the periods from the 1st until N^th^ usage: 
     17One solution to this is use a //PERI_ID// field that numbers the periods from the 1^st^ until N^th^ usage: 
    1618 
    17  TODO:: where to store the PERI_ID field? 
     19||= **PERI_ID** =|| 
     20|| Period of usage (1^st^, 2^nd^, 3^rd^ etc.) || 
     21|| Numeric || 
    1822 
    1923However this is an optional field that for most cohorts may not be needed. It also requires additional maintenance to keep it updated. 
     
    2125For databases that work with double data entry, such as most clinical databases, it becomes necessary to make each data entry unique and backwards traceable. For this to work a field like the above would have to be part of the primary key of each table that requires double data entry. 
    2226 
    23  TODO:: where to store the ENTRY_ID field? 
     27||= **ENTRY_ID** =|| 
     28|| Number of data entry || 
     29|| Numeric: [[BR]] `1` = first data entry [[BR]] `2` = second data entry [[BR]] `3` = comparison of 1^st^ and 2^nd^ data entry [[BR]] `4` = final approved record including corrections || 
    2430 
    2531With respect to performance, it might also be a good design to have 3 copies of each table, one to hold the data while being entered and compared, one for the two data entries to be archived into once a final record has been approved and a table holding the final and approved values. This way it is avoided that queries will have to work on checking for `ENTRY_ID = 4` and to select amongst a table holding 3 times the almost same data. 
     
    2733As part of an audit trail in a database a time stamp field could be added for each record to fix the exact time when the record last was inserted or updated. Along with the time stamp name of the user who entered or altered data can be recorded. 
    2834 
    29  TODO:: where to store the T_STAMP and USER_LOG fields? 
     35||= **T_STAMP** =||= **USER_LOG** =|| 
     36|| Date and time of data entry || Username of user that last inserted or updated data || 
     37|| yyyy-mm-dd hh:mm:ss || character || 
    3038 
    31 Often its necessary to keep a log of user action in a separate table. The above suggestion will only be valid for inserts and updates, and only be valid for the most recent action performed.  
     39Often it's necessary to keep a log of user action in a separate table. The above suggestion will only be valid for inserts and updates, and only be valid for the most recent action performed.  
    3240 
    3341To record a complete audit trail a logging facility must be implemented. In most database management systems this is done using triggers on the tables. For any insert, update or delete actions performed on the data, the user, time stamp, old value and new value are recorded in the logging table. 
     
    5361This setup not only enables integrity of the data, but also defines the domain^1^ for the //#_ID// values and enables data to both become human readable and easily recoded^2^. 
    5462 
     63An important note on lookup tables is that the performance on a large database can be slowed significantly by using character based keys to link lookup tables with the primary table as it is presented in this document. A work around is to use numeric value for the codes. 
     64 
    5565^1^: Domain is a term in the definition of the relational database model that defines a set of allowed values for a given set of fields (attributes), the mixing of different domains is not allowed in order to preserve the integrity of a relational and normalised model. 
    5666 
    5767^2^: Easily recoded permanently if the relation is specified as cascade on update or recoded dynamic by selecting a different column from the lookup table when querying the data through SQL 
    58  
    59 An important note on lookup tables is that the performance on a large database can be slowed significantly by using character based keys to link lookup tables with the primary table as it is presented in this document. A work around is to use numeric value for the codes. 
    6068 
    6169== Performance ==