wiki:Hicdep_1.60Printable

Version 1 (modified by trac, 6 years ago) (diff)

copy from version 1.50

This is a printer-friendly version of the HICDEP 1.30 specification available under http://www.hicdep.org/

HICDEP 1.30

This article describes the HICDEP version 1.30 which was released on the 28th of March 2008. It was superseded by version 1.50 on the 13th of September 2011. For a more detailed version history, please refer to the ChangeLog.

General data format

The table pages referenced in the overview describe the specific tables' structure in detail and present a list of suggested codes, both standard and human readable.

All codes apart from trivial no, yes or unknown codes are presented as lookup tables, the usage of these are described in the the article Considerations for using the format to create a database.

Along with the basic structure described in each “Core fields” section, additional fields containing additional or more specific data are described in the “Additional fields” sections. These fields were taken from several cohort collaborations but with the required changes that were needed for the specific data structures. This is presented to the reader to show that the core structure is not a fixed proposal but rather a basic structure, which can be altered by adding fields.

Issues regarding duplicates are discussed in Considerations For Data Management.

Overview of data tables

Table Content
tblAE holds type and date of adverse events including serious non-AIDS conditions
tblART holds type of antiretroviral drug, start and stop dates and reason for stopping
tblBAS holds basic information such as demographics, basic clinical information, date of AIDS diagnosis, death and drop-out information
tblDIS holds type and date of CDC-C diseases.
tblLAB holds type, date, value and unit of laboratory tests.
tblLAB_BP holds date, diastolic and systolic values and unit of blood pressure measurements.
tblLAB_CD4 holds date and value of CD4 measurements.
tblLAB_RNA holds date, value, detection limit and type of viral assay.
tblLAB_RES holds background information on the resistance test, laboratory, library, kit, software and type of test
tblLAB_RES_LVL_1 holds nucleoside sequence for the PRO and RT sequences
tblLAB_RES_LVL_2 holds mutations and positions of these.
tblLAB_RES_LVL_3 holds resistance result in relation to antiretroviral drug.
tblLAB_VIRO holds test results for viro-/serological tests (hepatitis etc.)
tblLTFU holds data in death and drop-out
tblMED holds type, start and stop dates for other HIV related medicines.
tblOVERLAP holds information on the patient's participation in other cohorts
tblVIS holds visit related information, weight, wasting.

Diagram

diagram of HICDEP tables with relations

Structure of data

From flat files towards a normalized structure

The data collected in HIV collaborations is presented on the following pages in a set of data files/tables. Typically data would be put into one data file that would hold one line/record per patient where each field is represented as a separate column in that dataset. Often a dataset could contain more than 3000 columns of data.

The implication of going from thousands of fields to fewer fields means that data is in fact transposed from the flat format into the normalised format.

Example of a flat file structure:

PATIENT ALAT_D ALAT_V ALAT_U ASAT_D ASAT_V ASAT_U
999999 01-01-2000 15 U/l 01-01-2000 12 U/l

The normalised structure would then be like this:

PATIENT TYPE_ID LAB_DATE LAB_VAL LAB_UNIT
999999 1 01-01-2000 15 U/l
999999 2 01-01-2000 12 U/l

The type of measurement is identified through the TYPE_ID field. Here 1 codes for ALAT and 2 codes for ASAT:

Code Description
1 ALAT - Alanin-Aminotransferase
2 ASAT - Aspartat aminotransferase

Technical considerations

To enable a normalised structure that minimises the number of columns dramatically, the one file solution must be broken into several minor tables. These breakdowns are driven by the different data characteristics.

Each table has a basic structure that includes the patient identifier, a code that represents e.g. drug, adverse event or laboratory test performed. Along with this combination values like date, result, unit etc are present for each record.

A record for a laboratory measurement would include:

  • Patient identifier
  • Measurement type identifier
  • Measured value
  • Unit of value
  • Date of determination

A record for usage of an antiretroviral drug would include:

  • Patient identifier
  • Drug identifier
  • Start date for usage
  • End date for usage
  • Reason for discontinuation

These issues imply that a set of distinct tables must be generated based on the “nature” of the data. Since laboratory, medication and event data both cannot and should not be mixed at least 3 tables must be designed. Additionally there are other types of information that need their own domains: background information on the patient (height, birth date etc.), visit related data (weight, blood pressure, wasting etc.), and resistance testing (the latter requires more consideration due to the diversity of data present).

In this protocol further separation of data into different tables are presented. These separations are not only based on the rules for the relational model and normalisation, but they are “culturally” related.

For example: antiretroviral treatment medication is kept in one table and other medication in another table; CD4 cell measurements and HIV-RNA measurements are put into separate tables, that are also different from the general laboratory table. These separations are done simply because data in these tables are of distinct importance in analysis and often are gathered more frequently and with more attention than other variables.

Coding Conventions

Date codes

Although it is best to have precise dates in the format of YEAR-MONTH-DAY ISO standard, it might be that some cohorts are limited to representing date data at the level of the month only, or information kept on the patient in the charts only defines dates to the month and in some cases only to the year. To solve this a set of date codes are presented here.

Day unknown

In this case the date should be coded as the 15th of the month – so that 1999-12-?? becomes 1999-12-15. This enables the date to be no more than 15 days away from the actual date.

Month and day unknown

Best approach to this is to apply something similar, as with unknown dates, this would then mean that 1999-??-?? becomes 1999-07-01.

Year unknown

If the year is unknown but the presence of the date value is needed as in case of opportunistic infections or adverse events (see later in this document) a fictive date should be used that couldn’t be mistaken with an actual date. An unknown year should be coded as 1911-11-11.

Specification of precision

An alternative to the above is to apply an additional field to each date field for which it is known that there might be issues regarding the precision of the dates. The field is then used to specify at which degree of the day, month or year the date is precise:

Code Precision of date
< Before this date
D Exact to the date
M Exact to the month
Y Exact to the year
> After this date
U Unknown

ICD-10 codes

The coding system is the official standard for coding of diseases, however there is a wide set of “homebrew” codes used within the HIV field in data coding in general, often it’s a 3 or 4 letter codes that is an abbreviation for the AIDS defining disease. ICD-10 doesn’t have single codes that represent all single CDC-C events and as a consequence of this a list of 3 to 4 letter codes is the recommended way of coding for all CDC stage C events

ICD-10 codes are however the recommended for codes AE’s since it would become impossible for this protocol to maintain a complete list of all possible AE’s. ICD-10 is also recommended for causes of death.

ATC codes

ATC is a hierarchical structure for coding medication. The structure and hierarchy are best explained with an example of how a drug code is defined. Here it is on Indinavir:

J
ANTIINFECTIVES FOR SYSTEMIC USE (1st level, anatomical main group)
J05
ANTIVIRALS FOR SYSTEMIC USE (2nd level, therapeutic subgroup)
J05A
DIRECT ACTING ANTIVIRALS (3rd level, pharmacological subgroup)
J05AE
Protease inhibitors (4th level, chemical subgroup)
J05AE02
Indinavir (5th level, chemical substance)

This hierarchy has some benefits as will be explained later, but one of its limitations is that it’s impossible to “read” the code compared to the widely used 3 letter mnemonic codes for antiretroviral drugs.

Example:

Drug Code ATC code
Indinavir IDV J05AE02

The difference is that the IDV code is easily readable, where the ATC code is not; going from a flat file structure to a normalised structure the human readable aspect becomes increasingly important. In the flat file format the column names and the possibility of labels makes data more or less readable; in the normalised format only the coding can help. Because of this the 3 letter codes are being presented in this document. However it must be stressed that usage of the ATC coding should be used to diminish the risk of several homebrew and non-compatible coding schemes.

Currently however, the ATC scheme does not provide sufficient detail on the specific drugs, there is e.g. no official way to code Saquinavir as hard or soft gel. Thus a slight alteration to the set of codes will be presented in the sections of the ART and MED tables. The alterations are designed to extend the existing structure of ATC.

One of the benefits is that the structure of ATC allows easier statistics on e.g. drug classes

  • J05AE Protease inhibitors
    • J05AE01 Saquinavir
    • J05AE02 Indinavir
    • J05AE03 Ritonavir
    • J05AE04 Nelfinavir
    • J05AE05 Amprenavir
    • J05AE06 Lopinavir
  • J05AF Nucleoside and nucleotide reverse transcriptase inhibitors
    • J05AF01 Zidovudine
    • J05AF02 Didanosine
    • J05AF03 Zalcitabine
    • J05AF04 Stavudine
    • J05AF05 Lamivudine
    • J05AF06 Abacavir
    • J05AF07 Tenofovir disoproxil
    • J05AF30 Combinations1
  • J05AG Non-nucleoside reverse transcriptase inhibitors
    • J05AG01 Nevirapine
    • J05AG02 Delavirdine
    • J05AG03 Efavirenz

Although the codes might be harder to read they provide grouping mechanisms in the way they are coded. Interested readers should go to the ATC Website to learn about the structure of ATC. A fully updated database of ATC codes and DDD (Defined Daily Dosage) is available for querying.

Other codes

It is often necessary to code for values like “Yes”, “No” and “Unknown”, this document suggests that the following codes should be used:

Code Description
0 No
1 Yes
9 Unknown

Unknown should be used to identify the difference between a value that has not yet been collected (Empty) and a value that cannot be collected (Unknown). Empty values should be required where Unknown values make little sense to keep querying for a value.

Example – weight:

Depending on the unit in which weight is measured, a different value for Unknown should be applied. In the case of kg the “Unknown” code should be 999 and not just 9 or 99, the last two could be actual values.

Blank values, for SAS users also known as "." and for database programmers known as NULL, should be used wherever specified in this protocol. However, sometimes it might be more correct just to omit the record if no value has been recorded, test has not been performed etc.

tblBAS - Basic clinical, background and demographic information

holds basic information such as demographics, basic clinical information, date of AIDS diagnosis, death and drop-out information

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • CENTER: Code for Clinic/Centre/Hospital where patient is seen.
  • BIRTH_D: Birth date.
  • FRSVIS_D: First seen at clinic
  • ENROL_D: Date of enrolment into the cohort
  • GENDER: Gender/sex
  • HEIGH: Height of patient at visit/most current
  • MODE: Mode of infection
  • ORIGIN: Nationality or region of origin of patient
  • ETHNIC: Ethnicity of patient
  • SEROCO_D: Date of seroconversion
  • RECART_Y: Has the patient received antiretroviral treatment?
  • AIDS_Y: Has patient been given an AIDS diagnosis?
  • AIDS_D: IF YES, date of AIDS diagnosis

Additional fields

TODO
why is DEATH_OT in tblBAS and not tblLTFU?

For mode of infection, origin and death a set of other fields are often used to capture what cannot be coded. These fields are represented here as optional fields as it is the intention that the suggested codes applied to the MODE, ORIGIN, DEATH_R1-3 and ICD10_1-3 should be able to cover all possible values.

  • MODE_OTH: Mode of infection OTHER
  • ORI_OTH: Origin of patient OTHER
  • DEATH_OT: Reason for death – other - description

tblLTFU - Death and drop-out

holds data in death and drop-out

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • DROP_Y: Has the patient DROPPED OUT?
  • DROP_D: IF YES, Date of Last Visit
  • DROP_RS: IF YES, Reason for DROP
  • DEATH_Y: Has the patient died?
  • DEATH_D: Date of Death
  • AUTOP_Y: Was an autopsy Performed?
  • DEATH_R1: Cause of death
  • DEATH_RC1: Coding of causal relation of the code given in DEATH_R1 to the death
  • DEATH_R2: Cause of death
  • DEATH_RC2: Coding of causal relation of the code given in DEATH_R2 to the death
  • DEATH_R3: Cause of death
  • DEATH_RC3: Coding of causal relation of the code given in DEATH_R3 to the death

List of DEATH_R# and DEATH_RC# should be continued for as many reasons that are recorded.

The DEATH_RC# fields should enable cohorts to transfer data in accordance with the Coding of Death project (CoDe). You are welcome to contact the CoDe group for electronic sample forms for detailed collection of data used for the CoDe review process.

CoDe defines 1 immediate, 2 contributing and 1 underlying cause of death.

Additional fields

  • ICD10_1: Cause of death as ICD-10 if available
  • ICD10_2: Cause of death as ICD-10 if available
  • ICD10_3: Cause of death as ICD-10 if available

List of ICD10_# inplace of or together with DEATH_R# and together DEATH_RC# and should be continued for as many reasons that are recorded.

CoDe defines 1 immediate, 2 contributing and 1 underlying cause of death.

  • L_ALIVE: Last date known to be alive

tblOVERLAP - Cross-cohort identification

holds information on the patient's participation in other cohorts

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • COHORT: Code/name of the cohort
  • PAT_OTH: Unique patient identifier in other cohorts
  • COH_OTH: Name of the cohort

Patients of an “original”-cohort who also participate in a “super”-cohort should be analysed within the “original”-cohort only. To suppress these patients from the datasets of the “super”-cohorts the identifier used in the “super”-cohort is needed. It is suggested that “original”-cohorts report id’s from the “super”-cohorts, since the “super”-cohorts might not even know the other ID’s. Often this information is only available at centre level.

A record should be present for each cohort that the patient is participating in (apart from it’s own “original”-cohort).

tblVIS - Basic follow-up/visit related data

holds visit related information, weight, wasting.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • VIS_D: Date of patient visit
  • WEIGH: Weight of patient at visit
  • GAIN_Y: Is the patient gaining fat in the abdomen, neck, breast or other defined locations?
  • LOSS_Y: Is the patient experiencing loss of fat from extremities, buttocks or face?

Depending on the collaboration this data might be collected in intervals of a year, e.g. from July last to July this year. In that case all visit dates or a fixed number of visit dates for that period should be gathered, if the patient did not have a visit in the defined period, a record with the PATIENT id and empty fields for VIS_D etc. should be included.

tblART - Antiretroviral treatment

holds type of antiretroviral drug, start and stop dates and reason for stopping

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: identifies patient
  • ART_ID: represents the antiretroviral treatment
  • ART_SD: date of initiation of treatment
  • ART_ED: date of stopping treatment
  • ART_RS: reason for stopping treatment

Additional fields

Depending on the aim of the study it might be needed to gather both the dosage and the frequency of the dosage taken. However many cohorts do not collect this date and thus these fields are optional.

PATIENT field

containing table
tblART
explanation of variable
Code to identify patient (Cohort Patient ID)
format of data
character (or numeric if possible)
exists since HICDEP version
1.30

ART_ID field

containing table
tblART
explanation of variable
Code representing the antiretroviral treatment
format of data
character. see coding table for valid codings.
exists since HICDEP version
1.30

Coding Table

A set of extended ATC codes are being presented here in order to code both more specific on subtypes of the drugs, e.g. saquinavir hard and soft gel, but also to enable coding of drugs that are at their trial stage and have not yet been assigned an ATC code. To do this the drug will be assigned the code elements as far down the levels as possible. Given two examples to illustrate this:

Saquinavir - Hard Gel
J05AE01-SQH
Saquinavir - Soft Gel
J05AE01-SQS
Saquinavir - not specified
J05AE01

This will ensure the fidelity needed to distinguish between hard and soft gel and not specified, but also for analysis easily include all records which coding starts with J05AE01, regardless if the drug is hard or soft gel.

See the ATC Index for the individual codes.

save Download this table as CSV filesave Download this table as XML file
Code (Extended ATC Codes)Anti-Retroviral Drugs
J05AART unspecified
J05A-BEVBeviramat
J05A-PBTParticipant in Blinded Trial
J05AEPI unspecified
J05AE-MOZMozenavir (DMP-450)
J05AE01Saquinavir (gel, not specified)
J05AE01-SQHSaquinavir hard gel (INVIRASE)
J05AE01-SQSSaquinavir soft gel (FORTOVASE)
J05AE02Indinavir (CRIXIVAN)
J05AE03Ritonavir (NORVIR)
J05AE03-HRitonavir high dose (NORVIR)
J05AE03-LRitonavir low dose (NORVIR)
J05AE04Nelfinavir (VIRACEPT)
J05AE05Amprenavir (AGENERASE)
J05AE06Lopinavir/Ritonavir (Kaletra)
J05AE07Fos-amprenavir (Telzir, Lexiva)
J05AE08Atazanavir (Reyataz)
J05AE09Tipranavir (Aptivus)
J05AE10Darunavir (TMC-114, Prezista)
J05AFNRTI unspecified
J05AF-ALOAlovudine
J05AF-AMDAmdoxovir (DADP)
J05AF-FOZFozivudine tidoxi
J05AF-LDNLodenosine (trialdrug)
J05AF-RVTReverset
J05AF01Zidovudine (AZT, RETROVIR)
J05AF02Didanosine (ddI) (VIDEX)
J05AF03Zalcitabine (ddC) (HIVID)
J05AF04Stavudine (d4T) (ZERIT)
J05AF05Lamivudine (3TC, EPIVIR)
J05AF06Abacavir (1592U89) (ZIAGEN)
J05AF07Tenofovir (VilREAD)
J05AF08Adefovir (PREVEON)
J05AF09Emtricitabine
J05AF10Entecavir
J05AF11Telbivudine
J05AGNNRTI unspecified
J05AG-CPVCapravirine
J05AG-DPC083DPC 083
J05AG-DPC961DPC 961
J05AG-EMVEmivirine (MKC442)
J05AG-ETVEtravirine (TMC 125)
J05AG-LOVLoviride
J05AG-RPVRilpivirine (TMC-278)
J05AG01Nevirapine (VIRAMUN)
J05AG02Delavirdine (U-90152) (RESCRIPTOR)
J05AG03Efavirenz (DMP-266) (STOCRIN, SUSTIVA)
J05AR01Combivir (Zidovudine/Lamivudine)
J05AR02Kivexa (Lamivudine/Abacavir)
J05AR03Truvada (Tenofovir/Emtricabine)
J05AR04Trizivir (Zidovudine/Lamivudine/Abacavir)
J05AR05Douvir-N (Zidovudine/Lamivudine/Nevirapine)
J05AR06Atripla (Emtricitabine/Tenofovir/Efavirenz)
J05AX-EVGElvitegravir (Gilead)
J05AX-VICVicriviroc (Schering)
J05AX07Enfurvirtide (Fuzeon, T-20)
J05AX08Raltegravir (Merck)
J05AX09Maraviroc (Pfizer)
L01XX05Hydroxyurea/Hydroxycarbamid (Litalir)

ART_SD field

containing table
tblART
explanation of variable
Date of Initiation of treatment
format of data
yyyy-mm-dd
exists since HICDEP version
1.30

ART_ED field

containing table
tblART
explanation of variable
Date of stopping treatment
format of data
yyyy-mm-dd
exists since HICDEP version
1.30

ART_RS field

containing table
tblART
explanation of variable
Reason for stopping treatment
format of data
character. see coding table for valid codings.
exists since HICDEP version
1.30

Coding Table

save Download this table as CSV filesave Download this table as XML file
CodeReason for Stopping Treatment
1Treatment failure (i.e. virological, immunological, and/or clinical failure
1.1Virological failure
1.2Partial virological failure
1.3Immunological failure - CD4 drop
1.4Clinical progression
2Abnormal fat redistribution
3Concern of cardiovascular disease
3.1Dyslipidaemia
3.2Cardiovascular disease
4Hypersensitivity reaction
5Toxicity, predominantly from abdomen/G-I tract
5.1Toxicity - GI tract
5.2Toxicity - Liver
5.3Toxicity - Pancreas
6Toxicity, predominantly from nervous system
7Toxicity, predominantly from kidneys
8Toxicity, predominantly from endocrine system
8.1Diabetes
9Haematological toxicity (anemia etc.)
10Hyperlactataemie/lactic acidosis
88Death
90Side effects - any of the above but unspecified
90.1Comorbidity
91Toxicity, not mentioned above
92Availability of more effective treatment (not specifically failure or side effect related)
92.1Simplified treatment available
92.2Treatment too complex
92.3Drug interaction
93Structured Treatment Interruption (STI)
93.1Structured Treatment Interruption (STI) - at high CD4
94Patient's wish/decision, not specified above
94.1Non-compliance
95Physician's decision, not specified above
96Pregnancy
97Study treatment
98Other causes, not specified above
99Unknown

Although most reasons mentioned above could be coded in ICD-10 codes, it is still certain that many reasons will never be defined in terms of ICD-10 (STI, physicians and patients decision), so to avoid a mixture of two different coding systems it is advised to go with the above list and build upon that for now.

tblMED - Other medication

holds type, start and stop dates for other HIV related medicines.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • MED_ID: Code representing the treatment
  • MED_SD: Date of Initiation of Treatment
  • MED_ED: Date of stopping treatment

Additional fields

Please see tblART - Antiretroviral treatment.

tblDIS - Opportunistic infections

holds type and date of CDC-C diseases.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • DIS_ID: Code to identify event
  • DIS_D: Date of event
  • DIS_WD: Means of diagnosis
  • DIS_OTH1: Other location, only to be filled out if code alone is not sufficient

1 DIS_OTH might be part of the record's unique identification

Additional fields

Please see tblAE - Adverse Events for specification on optional fields.

tblLAB - Laboratory values

holds type, date, value and unit of laboratory tests.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • LAB_ID: Code representing the measurement
  • LAB_D: Date of measurement/sample
  • LAB_V: Value of measurement
  • LAB_U: Unit of measurement

Additional fields

Other detailed information regarding the patient and the measurement would be relevant, like the proposed fasting information shown below.

  • LAB_FA: Was the blood sample taken while fasting?
  • LAB_ST: Specimen type

Depending on the set of measurements collected and the mandatory fields applicable to these individual measurements, it might be useful to separate the LAB table into several sub tables. This is already shown for the CD4 and RNA measurements: the level of detail needed for CD4 is less than for the LAB variables in general (no unit since it’s always the same), while for RNA the data required is more detailed (assay and detection limit).

tblLAB_CD4 - Laboratory values

holds date and value of CD4 measurements.

Note: If needed, a CD8 table (tblLAB_CD8) could be formed from the same structure.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • CD4_D: Date of measurement
  • CD4_V: Value of CD4 measurement

Additional fields

CD4_V is assumed to contain absolute CD4 cell counts per mL as standard. In case CD4 % should be collected as well, please append the following field to the table:

  • CD4_U: Unit of measurement

tblLAB_RNA - Laboratory values

holds date, value, detection limit and type of viral assay.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • RNA_D: Date of Measurement/Sample
  • RNA_V: HIV-RNA measurement value
  • RNA_L: Lower Limit of HIV-RNA Assay
  • RNA_T: IF AVAILABLE, What type of VIRAL ASSAY was used for this measurement?

Additional fields

  • RNA_UL: IF AVAILABLE, Upper Limit of assay

tblLAB_BP - Laboratory values - Blood pressure

holds date, diastolic and systolic values and unit of blood pressure measurements.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • BP_D: Date of Measurement/Sample
  • BP_SYS: Systolic Blood Pressure
  • BP_DIA: Diastolic Blood Pressure
  • BP_U: Unit of measurement

tblLAB_VIRO - Laboratory values - viro-/serology

holds test results for viro-/serological tests (hepatitis etc.)

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • VS_ID: Viral test
  • VS_D: Measurement date
  • VS_R: Measurement result
  • VS_V: Measurement value (HCV-RNA & HBV-DNA only) (copies/ml)
  • VS_U: Measurement unit

Additional fields

  • VS_LL: IF AVAILABLE, Lower limit of assay
  • VS_UL: IF AVAILABLE, Upper limit of assay
  • VS_T: IF AVAILABLE, type of ASSAY used for this measurement

tblLAB_RES - Resistance testing

holds background information on the resistance test, laboratory, library, kit, software and type of test

Note: This table is tightly linked to tblLAB_RES_LVL_1, tblLAB_RES_LVL_2 and tblLAB_RES_LVL_3.

Resistance should be reported at lowest level of interpretation possible – so if the nucleotide sequence is available this should be reported rather than the list of mutations or resistance scores. However, the resistance test results should be captured if they have been part of the physician’s treatment decisions for the patient.

These four tables are designed to capture several possible formats the clinics and cohorts might have recorded resistance test data in. Once this data is gathered it should like all other tables be quality assessed. For the full nucleotide sequences a short guide on “Sequence Quality Control” can be found here:

TODO
fix the following link

http://hiv-web.lanl.gov/content/hiv-db/CONTAM/contam_main.html

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • PATIENT: Code to identify patient (Cohort Patient ID)
  • SAMP_ID: The assigned sample ID
  • SAMPLE_D: Date of the actual sample taken (NOT the test date)
  • SEQ_DT: Date and time when the sequencing was performed
  • LAB: Name of laboratory where the test was performed
  • LIBRARY: Library/algorithm used to identify resistance mutations
  • REFSEQ: Name/identifier of reference HIV strain used to find mutations
  • KIT: Vendor and version/name of the kit used for the test
  • SOFTWARE: Software and version used to determine resistance
  • TESTTYPE: Type of test
  • SUBTYPE: Subtype of HIV-RNA

Additional fields

As shown with the core fields, the SAMP_ID is the link between the 3 levels of data and the test background information table. The sample identifier, however, must be unique for the format to work. This might not always be the case. If needed SAMPLE_D could be used as an additional part of the key, or just SAMPLE_D along with the PATIENT key1.

Some prior assessment of the assigned sample identifiers has to be done in order to avoid duplicates.

In a running database the duplicate issues are easily resolved by adding a unique auto-generated key as the identifier between 3 levels of data and the test background information table SAMP_ID.

Along with the SAMP_ID it might be necessary to store the ID assigned to the sample at both the testing laboratory but also the centres laboratory in order to track the sample. Each of these could also be used as the SAMP_ID value.

1: However this raises the issue about several aliquots from the same day will look like duplicates in the tables.

  • SAMP_LAB: The assigned sample ID at the lab where the resistance test is preformed.
  • SAMP_INT: The assigned sample ID from the centre.

tblLAB_RES_LVL_1 - Nucleotide sequences (PRO, RT, GP41, GP120)

holds nucleoside sequence for the PRO and RT sequences. No entry is made if the test was a phenotype test.

Note: This table is tightly linked to tblLAB_RES.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • SAMP_ID: The assigned sample ID
  • SEQTYPE: Type of nucleotide sequence if available
  • SEQ_START: Start position for the sequence
  • SEQ_STOP: Stop position for the sequence
  • SEQ_NUC: Nucleotide sequence if available

Additional fields

In cases where the amino acid sequence is collected rather than the nucleotide sequence, the field SEQ_NUC might be replaced with SEQ_AA, which is the nucleotide sequence, expressed in an amino acid sequence:

  • SEQ_AA: Amino acid sequence if available (empty if test was phenotype)

However using the amino acid sequence does not give the same detail of data as the nucleoside sequence: wobbles in the nucleoside sequence can either complicate the reading and alignment of the amino acid sequence or the wobbles can be lost and silent mutations are lost.

tblLAB_RES_LVL_2 - Mutations

holds mutations and positions of PRO and RT sequences.

Note: This table is tightly linked to tblLAB_RES.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • SAMP_ID: The assigned sample ID
  • GENE: Type of sequence/gene (PRO, RT, GP41, GP120)
  • AA_POS: Position of the mutation in the sequence
  • AA_POS_SUB: Subposition used to code insertions
  • AA_FOUND_1: Mutation (Amino acid) found in the sequence
  • AA_FOUND_2: Mutation (Amino acid) found in the sequence (if more than 1)
  • AA_FOUND_3: Mutation (Amino acid) found in the sequence (if more than 2)
  • AA_FOUND_4: Mutation (Amino acid) found in the sequence (if more than 3)

AA_FOUND_# could be extended if mixtures with more than 4 amino acids are found.

tblLAB_RES_LVL_3 - Resistance test result

holds resistance result in relation to antiretroviral drug.

Note: This table is tightly linked to tblLAB_RES.

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • SAMP_ID: The assigned sample ID
  • ART_ID: Drug code of antiretroviral
  • RES_SCOR: Score of resistance or recommendation given from the test.

Additional fields

For phenotype test results it will be necessary to extend the table with a field to store the cut-off value:

  • RES_CUT: Cut-off value for phenotype test result

However using the amino acid sequence does not give the same detail of data as the nucleoside sequence: wobbles in the nucleoside sequence can either complicate the reading and alignment of the amino acid sequence or the wobbles can be lost and silent mutations are lost.

tblAE - Adverse Events

holds type and date of adverse events including serious non-AIDS conditions

Core fields

Note: Fields marked bold form the unique identifier for a record of the table.

  • EVENT_ID: foreign key to event tables
  • PATIENT: identifies patient
  • AE_D: date of event
  • AE_ID: identifies type of event
  • AE_SPEC: further specification
  • SRCDOC_Y: whether the source documentation is available
  • SRCDOC_D: date for source documentation verification
  • VERIFY_Y: Has the monitor verified the source documentation?
  • VERIFY_D: date for monitor verification
  • APPROV_Y: final verification/approval
  • APPROV_D: final verification date
  • APPROV_S: signature for final verification
TODO
where to put the following text?

Please see the HICDEP website for examples of detailed AE tables for the events listed in 3.16.2 below. Data format is available in the HICDEP DAD event forms document.

Additional fields

  • AE_Y: has the patient had an event?
  • AE_NAME: full name of the event
  • AE_DESCRIP: full description of the event
  • AE_R_Y: relation to treatment

Considerations for data management

Uniqueness of PATIENT ID

Each table in this document is shown with the PATIENT field as part of the unique identifier for each record, in many cases it might be necessary to specify both PATIENT and COHORT as identifier since the PATIENT id might not be unique across cohorts, this should however be implemented only if needed.

Another solution, which might be worth considering when building the final dataset for the analysis, is to concatenate the patient id and the cohort id into a single unique new patient id.

Duplicate records

Transposition of the flat format data, where there is one record per patient, into a normalised structure that has multiple rows per patient requires that the combination of PATIENT, TYPE_ID and LAB_DATE is unique for each row in the table.

The normalised structure and the relational model does not allow for duplicate records like in the following table to exist:

PATIENT TYPE_ID LAB_DATE LAB_VAL LAB_UNIT
999999 1 2000-01-01 15 U/l
999999 1 2000-01-01 15 U/l

The typical solution to this is to generate an auto-incremented value – RECORD_ID - for each record in the table:

RECORD_ID PATIENT TYPE_ID LAB_DATE LAB_VAL LAB_UNIT
1 999999 1 2000-01-01 15 U/l
2 999999 1 2000-01-01 15 U/l

But it can, and should be dealt with in the design of the study. The presence of duplicate records like in the example shown above might not make any sense; if they did occur it then would be on account of a recording error. If, however, it was important to keep records that show measurements taken the same day but at different times, the format of the LAB_DATE shouldn’t be date (YYYY-MM-DD) but instead should be date-time (YYYY-MM-DD hh:mm:ss):

PATIENT TYPE_ID LAB_DATE LAB_VAL LAB_UNIT
999999 1 2000-01-01 10:00:00 15 U/l
999999 1 2000-01-01 14:00:00 15 U/l

Different and more sophisticated methods to make records unique will be presented later in this document under the definition of the LAB table.

Considerations for using the format to create a database

Administrative fields

Sometimes it might be needed to have a fixed value that shows from which visit or merger a value originates, this does not only apply to the VIS table but could be applied to all tables. This however does depend on the nature of the database and needs for data management, the field below should be considered an administrative support field for data management.

VISIT
Visit number
Numeric:
0 = Baseline Visit
1 = First follow up visit
2 = Second follow up visit
etc.

Often the above field is used for clinical trials databases where there is a need to associate the data directly with a given week’s follow-up. Codes could then be the week number e.g. 4, 12, 24 etc or –1 for screening/randomisation and 0 for baseline visits.

In some cases it might be useful to have a separate field that defines the correct order of the periods. This becomes important where several dates are incomplete (unknown days, unknown months and possibly unknown years). The ordering by date would then not be correct.

One solution to this is use a PERI_ID field that numbers the periods from the 1st until Nth usage:

PERI_ID
Period of usage (1st, 2nd, 3rd etc.)
Numeric

However this is an optional field that for most cohorts may not be needed. It also requires additional maintenance to keep it updated.

For databases that work with double data entry, such as most clinical databases, it becomes necessary to make each data entry unique and backwards traceable. For this to work a field like the above would have to be part of the primary key of each table that requires double data entry.

ENTRY_ID
Number of data entry
Numeric:
1 = first data entry
2 = second data entry
3 = comparison of 1st and 2nd data entry
4 = final approved record including corrections

With respect to performance, it might also be a good design to have 3 copies of each table, one to hold the data while being entered and compared, one for the two data entries to be archived into once a final record has been approved and a table holding the final and approved values. This way it is avoided that queries will have to work on checking for ENTRY_ID = 4 and to select amongst a table holding 3 times the almost same data.

As part of an audit trail in a database a time stamp field could be added for each record to fix the exact time when the record last was inserted or updated. Along with the time stamp name of the user who entered or altered data can be recorded.

T_STAMP USER_LOG
Date and time of data entry Username of user that last inserted or updated data
yyyy-mm-dd hh:mm:ss character

Often it's necessary to keep a log of user action in a separate table. The above suggestion will only be valid for inserts and updates, and only be valid for the most recent action performed.

To record a complete audit trail a logging facility must be implemented. In most database management systems this is done using triggers on the tables. For any insert, update or delete actions performed on the data, the user, time stamp, old value and new value are recorded in the logging table.

The T_STAMP field could also include information about which time zone is relevant for data entry. Depending on database requirements this might in fact be mandatory if the FDA’s 21 CRF part 11 on electronic records and signatures applies.

Further normalisation

Depending on performance considerations it might be worth looking at how data are queried for data entry and data analysis. A smaller tblBAS table might increase performance: Since processing a smaller table is always faster than processing a larger table, one could put drop-out, death, birthday, date of aids diagnosis, etc. into separate tables and keep the core patient list in a separate master table

But if the database is used e.g. for BMI calculations directly on the running database, performance might be enhanced by keeping the patient list and the height together in the same table so that a query involves 2 tables (tblBAS and tblVIS) rather than perhaps 3 or more.

Another consideration is space. Although it may not be much of an issue, it will be possible to minimise the actual size of the database by putting fields that may be empty for most patients, like death information, into a separate table in a 0-1 to 1 relation to the master table.

Lookup tables

In a running database the #_ID fields could be implemented as a foreign key to a linked lookup table containing all possible codes and their corresponding definitions in a text string.

ART lookup table

This setup not only enables integrity of the data, but also defines the domain1 for the #_ID values and enables data to both become human readable and easily recoded2.

An important note on lookup tables is that the performance on a large database can be slowed significantly by using character based keys to link lookup tables with the primary table as it is presented in this document. A work around is to use numeric value for the codes.

1: Domain is a term in the definition of the relational database model that defines a set of allowed values for a given set of fields (attributes), the mixing of different domains is not allowed in order to preserve the integrity of a relational and normalised model.

2: Easily recoded permanently if the relation is specified as cascade on update or recoded dynamic by selecting a different column from the lookup table when querying the data through SQL

Performance

As already outlined in the above section, there are also performance issues that may have to be considered.

When using the suggested data types presented in this document for a database implementation, it may be worth looking at the actual data at hand when defining the final data types. The aim of this document is to present a format that will work between cohorts with rather different setups.

If it is at all possible in many cases there may be a large performance gain by using numeric instead of character fields. Character fields have been suggested here for, amongst others, the PATIENT field. If the PATIENT id is purely numeric it’s worth using a numeric data type since it always faster for querying than a character field.

Whenever the field has to be character, make sure that only the needed amount of space is assigned for the field length; there is no need to assign 50 characters of memory if the field in fact only stores a 3-letter code.

Wiki page "CaseDefinitions" does not exist

Change log

Draft Version 1.100

Version 1.90

  • tblART
    • added new optional field GENERIC
    • added new optional field ART_RS2
    • added new optional field ART_RS3
    • added new optional field ART_RS4
    • ART_ID
      • Added new single tablet formulations:
        • J05AR15 (Atazanavir and cobicistat)
        • J05AR16 (Lamivudine and raltegravir)
        • J05AR17 (Emtricitabine and tenofovir alafenamide)
        • J05AR18 (Emtricitabine, tenofovir alafenamide, elvitegravir and cobicistat)
        • J05AR19 (Emtricitabine, tenofovir alafenamide and rilpivirine)
      • Deleted (moved to tblMED):
        • J05AF08";"Adefovir (PREVEON)"
        • J05AF10";"Entecavir"
        • J05AF11";"Telbivudine"
    • ART_RS
      • added new (sub-)codes for stopping medication
        • 11 (Bone toxicity)
        • 15 (Social contra-indication)
        • 16 (Contra-indication unspecified)
        • 16.8 (Contra-indication expired)
        • 16.9 (Contra-indication - other)
        • 17 (MTCT regimen completed)
        • 70 (Pregnancy - toxicity concerns (during pregnancy))
        • 75 (Pregnancy - switch to a more appropriate regimen for PMTCT)
        • 92.5 (Regular treatment termination (e.g. DAA's for HCV, antibiotics)
  • tblLTFU
    • DEATH_R1
      • added new subcodes:
        • 03.1.3 (HCV with liver cancer)
        • 03.2.3 (HBV with liver cancer)
  • tblMED
    • MED_ID
      • Added new drug codes:
        • A11CC (vitamin D)
        • G02CA (Tocolysis)
        • H02 (Corticosteroids)
        • J01 (Antibiotics)
        • J01GA01 (streptomycin)
        • J02AC05 (Isavuconazole)
        • J02AX04 (caspofungin)
        • J04AB05 (Rifapentine (Priftin))
        • J05AF12 (Clevudine)
        • J05AX GRAZ-ELB (Grazoprevir/Elbasvir)
        • J05AR-DAAS (Daclatasvir/Asunaprevir)
        • J07BM0 (HPV Vaccine)
        • J07BM01 (HPV Vaccine (types 6, 11, 16, 18))
        • J07BM02 (HPV Vaccine (types 16, 18))
        • J07BM03 (HPV Vaccine (types 6, 11, 16, 18, 31, 33, 45, 52, 58))
        • M05BA (bisphosphonate)
        • N05A (Antipsychotics)
        • N05CD (Benzodiazepine derivatives)
        • N05CF (Benzodiazepine related drugs)
        • N06A (Antidepressant)
        • N07BC (Other drugs used in opioid dependence)
        • N07BC01 (Buprenorphine)
        • N07BC02 (Methadone)
        • N07BC03 (Levacetylmethadol)
        • N07BC04 (Lofexidine)
        • N07BC51 (Buprenorphine, combinations)
        • V03AB15 (Naloxone)
      • Added new drug codes (moved from tblART):
        • J05AF08";"Adefovir (PREVEON)"
        • J05AF10";"Entecavir"
        • J05AF11";"Telbivudine"
      • Changed temporary drug names to ATC codes:
        • J05AR-A450OM (ABT-450/r/Ombitasvir) --> J05AX67 (Ombitasvir, paritaprevir(ABT-450) and ritonavir)
        • J05AR-LESO (Ledipasvir/Sofosbuvir) --> J05AX65 (Ledipasvir/Sofosbuvir)
        • J05AX-DBV (Dasabuvir) --> J05AX16 (Dasabuvir)
    • MED_RS
      • added new (sub-)codes for stopping medication (to keep identical to ART_RS)
        • 11 (Bone toxicity)
        • 15 (Social contra-indication)
        • 16 (Contra-indication unspecified)
        • 16.8 (Contra-indication expired)
        • 16.9 (Contra-indication - other)
        • 17 (MTCT regimen completed)
        • 70 (Pregnancy - toxicity concerns (during pregnancy))
        • 75 (Pregnancy - switch to a more appropriate regimen for PMTCT)
        • 92.5 (Regular treatment termination (e.g. DAA's for HCV, antibiotics)
  • tblVIS
    • Added field for Center CENTER
    • Added field for family history of CVD/Stroke FAM_Y
    • Added field for type of clinic/service CLIN_TYPE
    • Added field for speciality of physician SPEC_TYPE
    • Added field for stage of transition from adolescence to adulthood TRANS_STAGE

Version 1.80

  • tblART
    • ART_ID
      • Added:
        • J05AR11 (Lamivudine, tenofovir disoproxil and efavirenz)
        • J05AR12 (Lamivudine and tenofovir disoproxil)
        • J05AR13 (Lamivudine, abacavir and dolutegravir)
        • J05AR14 (Darunavir and cobicistat)
        • J05AX-CAB (Cabotegravir (GSK-744))
  • tblBAS
    • ETHNIC
      • Revised and added codes for different ethnicities
      • Changed limitations
      • Changed description
    • ORIGIN
      • Changed description to be more specific
  • tblLTFU
    • DEATH_R1
      • Added sub-types for 22 and 23
      • Added additional codes table
  • tblMED
    • MED_ID
      • Added:
        • J05AE11 (Telaprevir (INCIVEK, INCIVO))
        • J05AE13 (Faldaprevir)
        • J05AE14 (Simeprevir)
        • J05AE15 (Asunaprevir)
        • J05AX14 (Daclatasvir)
        • J05AX15 (Sofosbuvir)
        • J05AR-A450OM (ABT-450/r/Ombitasvir)
        • J05AR-DAAS (Daclatasvir/Asunaprevir)
        • J05AR-LESO (Ledipasvir/Sofosbuvir)
        • J05AX-DBV (Dasabuvir)
  • tblLAB
    • LAB_U
      • Added:
        • 14 (mg/24h)
        • 15 (mg/mmol)
        • 16 (fl)
        • 17 (µg/mL = mg/L)
        • 18 (µkat/L)
    • LAB_ID
      • Added:
        • ACRA (Albumin Creatinin Ratio)
        • PCRA (Protein Creatinin Ratio)
        • PROT (Protein)
        • PSA (Prostate-specific antigen)
        • PTH (Parathyroid Hormone)

Version 1.70

  • tblAE has been renamed to tblCEP
  • tblCEP
    • Added field CEP_V
    • CEP_ID
      • Added: HOSP, ICU, JAUN, LIVD, LIVT, USAB, ASP, BART, CHAG, NOCA, PCE, PMAR, REQU.
      • Added: BACT, ENDO, MENI, OSTI, PERI, PNEU, PYEL, LEIS, MCDI (Previously in tblDIS).
      • Removed: ANG, BYP, END as they are already in the field CEP_SPEC.
      • Removed COR as now specified with SUD_DEATH_Y and EXP_DEATH_Y in tblLTFU.
    • CEP_SPEC
      • Changed description of NADM - CERV to contain "grade 2 or higher"
      • Added values: LIVB - XX, HEP - XX, NADM - HENE, NADM - BRAIN
  • tblBAS
    • Added field RECART_D (Date ART started)
    • Added field LTART_D (Date last assessed for ART)
    • ETHNIC: Added Limitations section in description
    • Added field EDU_LVL (Last completed education Level)
    • Added field HIV_POS_D (Date of first positive HIV test)
  • tblDIS
    • DIS_ID: Removed HG, BACT, ENDO, MENI, OSTI, PERI, PNEU, PYEL, LEIS, MCDI (now in tblCEP).
  • tblVIS
    • Added optional field EMPLOY (What is the patient's current situation regarding labour?)
    • Added optional field CONTRACT (If the patient is an employee, what is the type of the patient's employment contract?)
    • Added optional field SMOKING_Y (Is the patient currently a smoker?)
    • Added optional field PREG_Y (Is the patient currently pregnant?)
    • Added optional field CDC_STAGE
    • Added optional field WHO_STAGE
  • tblLTFU
    • Added optional fields for ICD9 cause of death coding (ICD9_#)
    • Added optional field DEATH_SOURCE
    • Added optional fields SUD_DEATH_Y and EXP_DEATH_Y. Replacing COR
    • CoDe
      • Typos in the coding table (HBF -> HBV)
      • New codes for cancer (04.XX)
      • New codes for AMI (08.XX)
      • New code: 92.1 - Unknown, competing risks
  • tblART
    • ART_ID: Coding extended/updated to match current ATC codes.
      • Lopinavir / Ritonavir changed code from "J05AE06" to "J05AR10".
      • Elvitegravir changed code from "J05AX-EVG" to "J05AX11"
      • J05AR07 - Triomune
      • J05AR08 - Eviplera / Complera
      • J05AR09 - Stribild
      • J05AR10 - Kaletra / Aluvia
      • J05AX12 - Dolutegravir
      • V03AX03 - Cobicistat
    • ART_FR: Added 0.33 and 0.5. Added special value -1 for "frequency unknown".
    • ART_RS
      • Added new codes: 6.X, 92.31, 92.32, 92.33, 92.4, 92.9, 94.2, 96.1, 96.2, 97.1, 97.2, 97.6
  • tblLAB
    • LAB_ID
      • Removed redundant "APT" code. Use code "ALP"
      • Removed "GLYCE" (synonymous to Glucose in blood)
      • Added "AFP" (Alfa Fetoprotein) and "DIPP" (Dipstick result for protein in Urine).
    • LAB_U
      • Removed WBC codes (WBC was already removed from LAB_ID)
      • Added generic "13 - µg/L" and "99 - No units" code
    • LAB_ST
      • Added codes U and U24.
  • tblLAB_RNA
    • RNA_T
      • New code: 42 - Abbott RealTime HIV-1 m2000
      • New code: 59 - Monitor unspecified
  • tblMED
    • MED_ID
      • Changed description from "Other HIV-related drugs" to "Other medication"
      • Added codes for Boceprevir and Telaprevir
    • MED_RS
      • Same changes as in tblART ART_RS

Version 1.60

  • tblBAS:
    • The ORIGIN field is now coded using UN region and country codes.
    • Added optional SEROHOW field indicating how the seroconversion date was determined.
    • Added optional CENS_D field holding the last date the database was updated for a patient.
  • tblREFILL created to hold prescription refill data.
  • tblLTFU: The DEATH_R# fields now use CoDe codes for coding cause of death.
  • tblLAB:
    • Added codes for Gamma-glutamyltransferase and Prothrombin rate.
  • tblLAB_VIRO:
    • Changed code for "Other" of field VS_T to value 9.
    • Added "HBVACIGM", "HBVACIGG", "HCVBD" and "HDVA" codes for VS_ID field.
  • tblLAB_RES, tblLAB_RES_LVL_1, tblLAB_RES_LVL_2 and tblLAB_RES_LVL_3:
    • Renamed SAMP_ID to TEST_ID, as it gives a better description of what the values should encode.
  • tblLAB_RES:
  • tblLAB_RES_LVL_3:
    • ATC_CODE replaces ART_ID as HCV medication in tblMED may be encoded.
  • tblDIS:
    • Added optional DIS_ED field for the end date of the disease.
  • tblMED:
    • Added MED_RS describing the reason for stopping the treatment; needed for HCV.
  • tblAE:
    • Added a number of codes for AE_ID: ANG, ASCI, AVN, BYP, CERC, END, FIBS, FRA, HEP, HESY, LAC, LIVB, OESO, PAN and PERI.
  • QA checks have been extended and updated where necessary.

Version 1.50

Version 1.30

  • tblART: Updated list of drugs
  • tblAE:
    • added EVENT_ID as unique identifier and link to detailed tables for each event (see 1) – this replaces the optional AE_NO field.
    • added AE_SPEC to further specify an event by coding
    • a series of basic verification fields have been added to allow for tracking of event status for source documentation availability, verification of documentation (through monitoring) and final approval of the event.
    • in AE_R_Y - Relation to treatment: added more detailed codes.
  • tblLAB: added several codes for various biomarker tests.
  • tblLAB_CD4: added CD4_U as optional fields to discriminate between CD4% and CD4 cell count, so that the tblLAB_CD4 table can hold both types of measurements.
  • tblLAB_VIRO: added several codes for various virology and serology tests.
  • CaseDefinitions updated with end stage renal disease, chronic liver disease and non-AIDS defining malignancies

1:Detailed table definitions for the D:A:D events are available at http://www.cphiv.dk/HICDEP/Documents/tabid/159/Default.aspx

Version 1.25

  • tblART: Updated list of drugs
  • tblMED: Updated list of drugs
  • tblDIS:
    • Changed wording for CANO to ‘Candidiasis, oesophageal, bronchi, trachea, or lungs’
    • Added COCC - Coccidioidomycosis, disseminated or extrapulmonary
  • tblLAB: Added LAB_ST as additional field to code for type of specimen used for the measurement
  • tblLAB_CD4: Added CD4_U as additional field so the table can hold both percentage and absolute CD4 measurements
  • tblLAB_RNA: Added RNA_UL (upper limit of detection) to the list of additional fields.
  • Added more viral assays to the list of RNA_T codes
  • tblLAB_VIRO: Added unit field to tblLAB_VIRO into the general format and VS_LL (lower limit of detection), VS_UL (upper limit of detection) and VS_T (type of test) and list of tests to the list of additional fields.
  • tblLTFU: Added DEATH_RC# to code for causal relation of the DEATH_R# code to the death in order to comply with CoDe and still maintain a format to be used for cohorts not using CoDe. ICD10_# fields have been moved to the list of additional fields.

Version 1.21

Code Coding for Reason of Stopping Treatment
1.1 Virological failure
1.2 Partial virological failure
1.3 Immunological failure – CD4 drop
1.4 Clinical progression
90 Side effects – any of the above but unspecified
90.1 Comorbidity
92.1 Simplified treatment available
92.2 Treatment to complex
92.3 Drug interaction
93.1 Structured Treatment Interruption (STI) – at high CD4
94.1 Non-compliance
96 Pregnancy
97 Study treatment

Version 1.2

Version 1.1

  • tblBAS:
    • The table was split into tblBAS and tblLTFU. tblLTFU holds data on death and drop-out
    • Renamed LOS_Y to LOSS_Y
    • Renamed GAI_Y to GAIN_Y
  • tblLAB_BLP:
    • Renamed table to tblLAB_BP
    • Renamed BLP_D to BP_D
    • Renamed BLP_SYS to BP_SYS
    • Renamed BLP_DIA to BP_DIA
    • Renamed BLP_U to BP_U

Version 1.00

  • tblBas:
    • Renamed BIRTHDAY to BIRTH_D
    • Renamed FIRSTVIS to FRSVIS_D
    • Renamed REC_ART to RECART_Y
  • tblLAB:
    • LAB_U: has been dropped – please use the “unit codes/strings” as that is a safer way to code/represent the units – prefixing all “unit codes/strings” with a numeric value should however make analysis easier.
  • tblLAB_VIRO:
    • New table added to capture mainly hepatitis measurements/tests
  • tblLAB_RES:
    • SEQ_DT was added to capture the time of sequencing in order to facilitate quality assurance of the data for contamination that might have happened during the sequencing.
  • tblLAB_RES_LVL1:
    • Renamed SEQ_ST to SEQ_STAR
    • SEQ_STOP: Added to the table to specify at which position in the sequence the sequencing was terminated
  • tblLAB_RES_LVL2:
    • The table has been optimised for ease of analysis so that the mutation codes have been split into their components of amino acid position, sub position for insertions and 4 our more fields for mixtures of amino acids found in the sample

Version 0.50 and 0.90

First public versions that incorporated comments and corrections received from attendees at the 7th International Workshop on HIV Observational Databases, March 29th-30th 2003, Fiuggi, Italy and Stephen Hart.

Version 0.38

Version presented at 7th International Workshop on HIV Observational Databases, March 29th-30th 2003, Fiuggi, Italy