Last Post 12 Feb 2011 12:00 AM by  SuperUser Account
Paediatric tables / design issues
 13 Replies
Author Messages

SuperUser Account



Basic Member


Posts:289
Basic Member


--
31 Jan 2011 11:00 AM

    Hi everybody,

    I am currently including the paediatric tables (tblDELIVERY, tblNEWBORN, tblPREG, tblPREG_OBS and tblPregOut) in the HICDEP 1.50 draft. You can see the status quo in the above wiki articles.

    One thing that stands out to me is that they are not designed in a relational way. For example, tblNEWBORN basically consists of 28 fields: a boolean and a free-text character field for every problem that can occur. This is OK if the data is recorded and fed into the database only once all of it is available. If that is not the case, you end up having to modify existing records to add information, which is harder to handle correctly than a more relational layout consisting of a lookup table for all problems and the fields 'OCCURRED_Y' and 'DESC'. Also, in the relational approach, adding a new problem wouldn't require adding another 2 fields to an already large table. The downside, of course, is that the data is not as easily accessible from a statistician's software. Similar arguments apply to tblPREG_OBS etc.
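    To make the contrast concrete, here is a minimal sketch of the two layouts in plain Python. The column names PROBLEM_A_Y, PROBLEM_A_DESC and so on are hypothetical placeholders, not the actual tblNEWBORN fields; only 'OCCURRED_Y' and 'DESC' come from the proposal above.

```python
# Flat layout: one wide record per newborn, two columns per problem.
# Column names here are hypothetical, not the actual HICDEP fields.
flat_newborn = {
    "CHILD_ID": "C1",
    "PROBLEM_A_Y": None, "PROBLEM_A_DESC": None,
    "PROBLEM_B_Y": None, "PROBLEM_B_DESC": None,
}

# Relational layout: a lookup table of problems plus one row per observation.
problem_lookup = {1: "Problem A", 2: "Problem B"}
newborn_problems = []  # rows of (CHILD_ID, PROBLEM_ID, OCCURRED_Y, DESC)


def record_flat(record, problem_column, occurred, desc):
    """Late-arriving data means modifying the existing record in place."""
    record[problem_column + "_Y"] = occurred
    record[problem_column + "_DESC"] = desc


def record_relational(rows, child_id, problem_id, occurred, desc):
    """Late-arriving data is appended as a new row; nothing is modified."""
    rows.append((child_id, problem_id, occurred, desc))


record_flat(flat_newborn, "PROBLEM_A", True, "noted at follow-up")
record_relational(newborn_problems, "C1", 1, True, "noted at follow-up")

# A new kind of problem only needs a lookup entry, not two new columns.
problem_lookup[3] = "Problem C"
record_relational(newborn_problems, "C1", 3, False, "")
```

    In the flat version the existing record is rewritten, while in the relational version rows are only ever added.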

    So here are my questions to you:

    1. When is the data contained in these tables usually recorded?
    2. Where do you draw the line between relational and flat design? What do you consider most useful?

    Any thoughts are appreciated.

    Kind regards,

    simon


    01 Feb 2011 12:00 AM
    From a technical point of view I agree it is better to get it as relational as possible. The cohorts have to do some programming to create the HICDEP files anyway, so in my view we should get the design as normalised as possible. Another advantage of this is that it trains them to think in a relational way.

    I would suggest for tblNEWBORN that we take out the abnormalities, exactly as you suggest, into one subtable tbl_ABNORM, and also the Apgar scores into another subtable tbl_APGAR, linking both to tblNEWBORN using the unique patient ID.

    But that's the technical data-manager view. Let's see what the clinicians think.

    Charlotte

    02 Feb 2011 12:00 AM


    Since the consensus from yesterday's TC seems to be that the tables should be normalized, I went ahead and applied the suggested changes to the draft.

    thanks!

    03 Feb 2011 12:00 AM


    Hi Simon

    I've just browsed these tables and they look great. A couple of minor comments:

    NEWBORN table: There are two "APGAR_3" fields when I think we only need one.

    DELIVERY table: It would be good to include "CHILD_ID" in this table in the case of multiple births. Also in the "overview" section where all the table names are listed, the description for this table could be changed from "Information on the delivery" to "Information on births".

    PREG_OBS table: I was wondering if we needed the "PROB_Y" variable as if the problem hadn't occurred then it wouldn't be specified in "PROB_T". But perhaps "PROB_T" is asking for cohorts to list potential problems that were treated in pregnancy, and then "PROB_Y" is asking if these problems eventually happened?

    Once we are happy with these tables I can send them round to the mother-to-child cohorts in EPPICC for their final comments, if useful?

    Thanks and best wishes Ali

    04 Feb 2011 12:00 AM


    Hi Ali,

    APGAR_3: yep, that was a copy/paste mistake.

    DELIVERY: (Note: I am a layman when it comes to pregnancies and deliveries, so please correct me if I'm wrong ;-))

    As far as I can see, all fields in this table provide information which relates strictly to the mother, therefore we shouldn't (need to) provide the CHILD_ID there. If needed, the connection between mother and child can always be looked up in tblNEWBORN. If there are fields which are specific to a child, they should probably be moved to a separate table since otherwise we would duplicate the information which applies only to the mother.

    PREG_OBS: In the non-normalized version it was possible to state whether a problem occurred or not, or if it's unknown whether or not it occurred. Without the "PROB_Y" field, you could no longer tell whether information was not recorded or it was recorded that the event did not happen.

    Regarding the second part of your question: I'd say PROB_Y does not specify whether the problem was treated or not, it is only concerned with the diagnosis. If the original intention was different, we should change the phrasing to be more clear.

    Kind regards,

    Simon

    04 Feb 2011 12:00 AM


    ;-)

    Delivery: In the case of multiple births, the birth sequence number will be different, the time and possibly date of delivery will differ, and the mode of delivery could be different. These variables relate to the child, so adding the child ID would be good.

    Preg_obs: The way the table is set up now, if I were a data manager, I would only submit data in this table if one of these problems occurred. For each patient who had no problems, I wouldn't include PROB_T=1 PROB_Y=0, PROB_T=2 PROB_Y=0, etc., though isn't this what the current format is asking for? It seems a bit cumbersome. Could you have a more general variable, "any obstetric problems in pregnancy? Yes/No", in the PREG table and then only complete PREG_OBS if there are problems?

    Thanks Ali

    05 Feb 2011 12:00 AM


    DELIVERY: D'oh! I completely missed the birth sequence number. I added the CHILD_ID as another primary key field. Before, only the MOTHER_ID was a PK which basically prohibited a woman from ever giving birth to more than one child.

    In its current state, the variables LABOR, INTERV, INTERV_O, MEMRUP_D, MEMRUP_T, TEAR_Y, BLDLOSS, CONTREAT, and DISCHA_D need to be entered for every child born during the same delivery, which means we would need QA checks to assert that the information is identical for all children born during the same delivery. Another option would be to normalize further, i.e. split the table into DELIVERY and DELIVERY_COMMON (for lack of a better name):

    tblDELIVERY_COMMON
    MOTHER_ID
    MEMRUP_D
    MEMRUP_T
    LABOR
    INTERV
    INTERV_O
    TEAR_Y
    BLDLOSS
    CONTREAT
    DISCHA_D

    tblDELIVERY
    MOTHER_ID
    MEMRUP_D
    CHILD_ID
    B_SEQ
    DELIV_D
    DELIV_T
    DELIV_M
    LABOR_P

    The only check needed in this scenario is that there exists a record in DELIVERY_COMMON with the MOTHER_ID and MEMRUP_D given in DELIVERY. Any comments on this layout?
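    For illustration, the existence check could look roughly like the sketch below. It assumes the two tables have been loaded as lists of dicts keyed by field name; the function name and the example rows are made up for this post.

```python
def find_orphan_deliveries(delivery_rows, delivery_common_rows):
    """Return DELIVERY rows with no matching DELIVERY_COMMON record.

    Rows are dicts; the match is on (MOTHER_ID, MEMRUP_D) as proposed above.
    """
    common_keys = {(r["MOTHER_ID"], r["MEMRUP_D"]) for r in delivery_common_rows}
    return [
        r for r in delivery_rows
        if (r["MOTHER_ID"], r["MEMRUP_D"]) not in common_keys
    ]


# Made-up example data: twins for mother 1234, plus one row with no parent.
delivery_common = [{"MOTHER_ID": 1234, "MEMRUP_D": "2000-01-01"}]
delivery = [
    {"MOTHER_ID": 1234, "MEMRUP_D": "2000-01-01", "CHILD_ID": "A", "B_SEQ": 1},
    {"MOTHER_ID": 1234, "MEMRUP_D": "2000-01-01", "CHILD_ID": "B", "B_SEQ": 2},
    {"MOTHER_ID": 5678, "MEMRUP_D": "2001-06-15", "CHILD_ID": "C", "B_SEQ": 1},
]

orphans = find_orphan_deliveries(delivery, delivery_common)
print([r["CHILD_ID"] for r in orphans])
```

    Both twins resolve to the same DELIVERY_COMMON record, so the shared fields are stored once and need no cross-row consistency check.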

    PREG_OBS: I see your point. I assumed the data managers have tools which allow convenient entering (e.g. a form with a list of checkboxes) and write the results in a relational format, but that may just be wishful thinking on my side. I don't see what the 'any problems?' variable would be good for, though. If we decide to just distinguish yes/no then we can drop PROB_Y entirely, but again, this reduces the information available. I lack the knowledge to judge whether that loss is acceptable or not, but if it is I can go ahead and change the table.

    06 Feb 2011 12:00 AM


    Hi Simon and Charlotte

    Simon, I like the way you have split the DELIVERY table. Could you name the two tables DELIVERY_MUM and DELIVERY_CHILD?

    Charlotte, do you have any comments on the way that PREG_OBS is set up at the moment, or how it can be improved?

    Thanks Ali

    07 Feb 2011 12:00 AM
    Hello,

    Since I wasn't able to express myself clearly during the TC of April 5th, I would like to clarify the problem I am facing here:

    08 Feb 2011 12:00 AM
    If the following table is provided:

    MOTHER_ID  MEMRUP_D    PROB_T                PROB_Y  CERVIX_S
    1234       2000-01-01  Hypertension          Yes     .
    1234       2000-01-01  Preterm contractions  No      .

    These are the statements being made:

    Mother 1234 suffered from hypertension during the delivery of 2000-01-01.
    Mother 1234 did not suffer from preterm contractions during the delivery of 2000-01-01.
    It is unknown whether any other problems occurred or not.

    09 Feb 2011 12:00 AM
    Compare this to the following approach:

    MOTHER_ID  MEMRUP_D    PROB_T        CERVIX_S
    1234       2000-01-01  Hypertension  .

    These are the statements being made:

    Mother 1234 suffered from hypertension during the delivery of 2000-01-01.
    It is unknown whether any other problems occurred or not.
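    The difference between the two schemes can be spelled out as a small sketch. The function names, the status strings and the "Other problem" label are made up for illustration; the rows mirror the two example tables.

```python
def _find(rows, mother_id, memrup_d, problem):
    """Return the first matching PREG_OBS row, or None if there is none."""
    for r in rows:
        if (r["MOTHER_ID"], r["MEMRUP_D"], r["PROB_T"]) == (mother_id, memrup_d, problem):
            return r
    return None


def status_with_prob_y(rows, mother_id, memrup_d, problem):
    """Scheme 1: PROB_Y is stored, so 'No' and 'not recorded' are distinct."""
    row = _find(rows, mother_id, memrup_d, problem)
    if row is None:
        return "unknown"
    return {"Yes": "occurred", "No": "did not occur"}.get(row["PROB_Y"], "unknown")


def status_without_prob_y(rows, mother_id, memrup_d, problem):
    """Scheme 2: a row means the problem occurred; no row means unknown."""
    row = _find(rows, mother_id, memrup_d, problem)
    return "occurred" if row is not None else "unknown"


scheme_1 = [
    {"MOTHER_ID": 1234, "MEMRUP_D": "2000-01-01", "PROB_T": "Hypertension", "PROB_Y": "Yes"},
    {"MOTHER_ID": 1234, "MEMRUP_D": "2000-01-01", "PROB_T": "Preterm contractions", "PROB_Y": "No"},
]
scheme_2 = [
    {"MOTHER_ID": 1234, "MEMRUP_D": "2000-01-01", "PROB_T": "Hypertension"},
]

for problem in ("Hypertension", "Preterm contractions", "Other problem"):
    print(
        problem,
        "|", status_with_prob_y(scheme_1, 1234, "2000-01-01", problem),
        "|", status_without_prob_y(scheme_2, 1234, "2000-01-01", problem),
    )
```

    The only case where the two schemes disagree is a recorded "No": the first scheme can state that preterm contractions did not occur, the second can only say it is unknown.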

    10 Feb 2011 12:00 AM
    Here is my question to the group:

    Which scheme is more valuable to a study investigating these variables? Do you have any other comments?

    Thanks!

    Simon

    11 Feb 2011 12:00 AM
    Firstly, I think that we need a field PREGOBS_Y in the parent table PREG which means "Did this patient have any obstetrical problems during this pregnancy?", and this should have the codes No, Yes or Unknown. If No or Unknown, there should be no records in PREG_OBS, but if Yes, then PREG_OBS will have some records.

    Secondly, how do we code them in PREG_OBS? We don't want the cohorts to have to program in every single problem if it didn't occur or if we don't know whether it occurred. Furthermore, if we were to allow cohorts to record that a problem definitely didn't occur (i.e. putting PROB_Y = No), then we would not be able to tell whether it definitely didn't occur or whether we didn't record the fact that it didn't occur! So, to avoid confusion and erroneous claims from the data, I am sure that it is better not to have the field PROB_Y at all. So it is best to use your second approach. Then, if someone asks of the data "in how many patients did hypertension occur?", you can count how many hypertension records there are in PREG_OBS, and your total patients for this query will be the number of pregnancies in which PREGOBS_Y in the parent table is set to Yes. This is in line with how we record everything else, like MED, DIS etc.
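    As a rough sketch of that query in Python: the function name and example rows are made up, and linking PREG to PREG_OBS on (MOTHER_ID, MEMRUP_D) is an assumption borrowed from the example tables earlier in this thread, since the actual PREG key is not spelled out here.

```python
def problem_counts(preg_rows, preg_obs_rows, problem):
    """Return (pregnancies with the problem, pregnancies with PREGOBS_Y = Yes).

    Assumption: PREG and PREG_OBS are linked on (MOTHER_ID, MEMRUP_D).
    """
    with_problems = {
        (r["MOTHER_ID"], r["MEMRUP_D"])
        for r in preg_rows
        if r["PREGOBS_Y"] == "Yes"
    }
    affected = {
        (r["MOTHER_ID"], r["MEMRUP_D"])
        for r in preg_obs_rows
        if r["PROB_T"] == problem
    }
    # PREG_OBS rows whose parent is not flagged Yes are inconsistent data
    # and are ignored here rather than counted.
    return len(affected & with_problems), len(with_problems)


# Made-up example: three pregnancies, two of them with recorded problems.
preg = [
    {"MOTHER_ID": 1, "MEMRUP_D": "2000-01-01", "PREGOBS_Y": "Yes"},
    {"MOTHER_ID": 2, "MEMRUP_D": "2000-02-01", "PREGOBS_Y": "Yes"},
    {"MOTHER_ID": 3, "MEMRUP_D": "2000-03-01", "PREGOBS_Y": "No"},
]
preg_obs = [
    {"MOTHER_ID": 1, "MEMRUP_D": "2000-01-01", "PROB_T": "Hypertension"},
    {"MOTHER_ID": 2, "MEMRUP_D": "2000-02-01", "PROB_T": "Preterm contractions"},
]

numerator, denominator = problem_counts(preg, preg_obs, "Hypertension")
print(numerator, "of", denominator)
```

    The denominator follows the description above, i.e. the pregnancies in which PREGOBS_Y is Yes, not all pregnancies in PREG.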

    12 Feb 2011 12:00 AM


    Thanks, everybody, for your input. I incorporated the proposed changes into the draft articles the way Charlotte suggested, since I think it's the most consistent and workable approach. If you see potential for further improvement, let the forum know!

    Kind regards,

    Simon


    ---