
Ability to input specific totals for categorical data elements #346

When extracting categorical data elements, there’s no box to edit the total N for the data reported.

I know this is something you’ve discussed previously with @Nicole_hardy but I wondered if you had any updates/what the status is on this? Thanks :)

4 years ago

Thanks for asking! What would you like this for? Our reasoning for not offering N is that it can be inferred from the sum of individual category counts, and we don’t want to give you more work (and opportunity to make mistakes)!

4 years ago

Categorical data elements can be configured such that the sum of the individual counts doesn’t make sense. For example, in the screenshot, Existing Medications is being extracted, and in the supplement the totals for each group differ from the original intervention arm sizes: beta blockers are /2521 in Vericiguat and /2519 in Placebo as shown in the supplement on the left, vs. /2526 and /2524 for the original arm sizes.

4 years ago

I’m perhaps a bit confused - it’s likely either that the DE wasn’t properly configured, or it’s not being extracted properly. In general, the categories of a categorical DE should be independent; each patient should fall into exactly one of them. Sometimes users may set up non-independent categories out of convenience, but I’d recommend avoiding this.

I’m not sure how adding N would alleviate this; are you saying sum(n) > N should be an error condition?

4 years ago

You’re right, this is actually my first experience with a categorical DE configured this way.

@Nicole_hardy Perhaps you have better insight on this?

4 years ago

Hi Karl. This happens a lot, even for categorical variables that should in theory add together. As things are right now, the arm size in the top boxes has to dictate the sample size for the categorical variables. However, it is very common to have different sample sizes for different categorical variables. For example, the total arm size might be 100, but only 98 people have stroke locations measured and only 90 people have pre-mRS measured. We don’t have a way to configure loss to follow-up for different categorical variables, even if their categories should add together.
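To make the arithmetic concrete, here is a minimal Python sketch of the stroke-location case above. Only the totals (arm size 100, 98 measured) come from this thread; the category names and counts are invented for illustration, and `category_percentages` is a hypothetical helper, not anything the extraction tool provides.

```python
def category_percentages(counts, total):
    """Return each category's share of `total` as a percentage (1 decimal place)."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


ARM_SIZE = 100  # arm size entered in the top boxes

# Hypothetical counts: only 98 of the 100 patients had a stroke location recorded.
stroke_location = {"anterior": 70, "posterior": 28}
measured = sum(stroke_location.values())

# Denominator forced to the arm size (the current behaviour described above).
by_arm_size = category_percentages(stroke_location, ARM_SIZE)

# Denominator set to the number of patients actually measured.
by_measured = category_percentages(stroke_location, measured)

print(by_arm_size)
print(by_measured)
```

With the arm size as denominator the percentages do not add up to 100, because the two unmeasured patients are silently counted as belonging to neither category. That gap is what a per-variable total would close.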

Additionally, I agree with you in principle that categorical variables should have categories that add together. However, I think that principle gets in the way of easy data collection, and we should relax it. For example, there are often several comorbidities listed that we group under “medical history.” Configuring separate DEs for each comorbidity would really clutter extraction and create more disorganization.

In sum, I think this is a desperately needed feature. I spent like two hours getting John a spreadsheet of the different sample sizes for 51 papers in a nest for a categorical variable (stroke location), and that is a case where all the categories should add together.

4 years ago

“I think this is a desperately needed feature”

I apologize for being dense, but I don’t understand what feature you’re asking for! Are you asking for validation that sum(n) = N? A computed column for what the total sum(n) is?

I like saving your time, so I’m all ears!
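For readers following along, the two options raised in this post (a computed sum(n) column, and a validation of sum(n) against N) could look roughly like the Python sketch below. Everything here is an assumption made for illustration: the function names, the optional `reported_total` argument, and the example counts are hypothetical and do not describe the product's actual behaviour.

```python
from typing import Dict, List, Optional


def computed_total(counts: Dict[str, int]) -> int:
    """Option 1: a computed column, i.e. the sum of the category counts."""
    return sum(counts.values())


def check_total(counts: Dict[str, int], reported_total: Optional[int]) -> List[str]:
    """Option 2: compare sum(n) with a user-entered N and return warnings.

    An empty list means nothing to flag. If no total was entered, there is
    nothing to validate against.
    """
    if reported_total is None:
        return []
    total = computed_total(counts)
    if total > reported_total:
        return [f"sum(n) = {total} exceeds N = {reported_total}; "
                "categories may overlap"]
    if total < reported_total:
        return [f"sum(n) = {total} is below N = {reported_total}; "
                "some patients fall in no category"]
    return []


# Hypothetical counts for a variable measured in 98 patients.
counts = {"anterior": 70, "posterior": 28}
print(computed_total(counts))
print(check_total(counts, 98))
print(check_total(counts, 100))
```

Returning warnings rather than raising an error is a deliberate choice in this sketch: overlapping categories such as grouped comorbidities would legitimately give sum(n) > N, so a hard failure might block valid extractions.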

4 years ago

You’re not being dense! Let’s set up a quick meeting and I’ll show you the use case and explain how this comes up. Let me know what times work. I think it’ll be a bit easier to show you the problem concretely and solutions we’ve implemented to get around it in the meantime.

4 years ago
Changed the status to Under Consideration
4 years ago

A related comment from @Kristen Hutchison – Because there are typically a number of patients who drop out of a study, I commonly have to change the n value when extracting data elements. As far as I know, we aren’t able to do this for categorical variables, and it would be great if we could get a “Total” box where this N could be entered, as there is for dichotomous and continuous variables.

4 years ago